text-generation-inference
https://github.com/huggingface/text-generation-inference
Python
Large Language Model Text Generation Inference
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're a real pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported · 1 Subscriber
Add a CodeTriage badge to text-generation-inference
Help out
- Issues
- Update quantization kernels
- How to detect watermark?
- fix: enable defs references in tool calls
- Attempt to fix CI errors
- Vision encoder warmup fails with CPU vs CUDA mismatch (F.linear() input_tensor on CPU, weight on CUDA)
- PermissionError: [Errno 13] Permission denied: '/data' when deploying TGI on Hugging Face Spaces
- OpenTelemetry support for http endpoints
- Retrieve the correct cached model batch size in Neuron config checker for Neuron Backend
- Faster (dynamic) grammar compilation
- Deploying Gemma-3-1b-it with NVIDIA GPU P2000 - gets error
- Docs