text-generation-inference
https://github.com/huggingface/text-generation-inference
Python
Large Language Model Text Generation Inference
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
1 subscriber
Add a CodeTriage badge to text-generation-inference
Help out
- Issues
- Permissions issues with a shared volume mounted via mountpoint-s3
- Error "Failed to buffer the request body: length limit exceeded" when supplying base64-encoded images larger than 1 MB in the prompt
- Request failed during generation: Server error: 'FlashMixtral' object has no attribute 'compiled_model'
- Unable to start TGI with llama3-70b
- EETQ quantization of the model cannot be performed locally
- Take num_return_sequences into account to return multiple outputs (see the best_of sketch after this list)
- Inference error for Mistral-7B v0.2 when deploying on an Azure VM
- Help me add NLLB
- The top_k, typical_p, and do_sample settings in the request do not seem to affect generation (see the sampling-request sketch after this list)
- The transformation between repetition_penalty and presence_penalty seems to be incorrect (see the penalty sketch after this list)
- Docs
- Python not yet supported
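
A note on the num_return_sequences issue above: TGI does not expose transformers' num_return_sequences directly, but its documented best_of parameter samples several sequences per request and, when details are requested, reports the non-selected candidates under details["best_of_sequences"]. The sketch below is a minimal illustration of that approach; the localhost URL and port are assumptions (default Docker setup), not something stated in the issue.

```python
import requests

# Hedged sketch: approximating num_return_sequences with TGI's best_of.
# best_of samples several sequences server-side; with details enabled the
# non-selected candidates are reported under details["best_of_sequences"].
payload = {
    "inputs": "Write a haiku about GPUs.",
    "parameters": {
        "do_sample": True,      # best_of > 1 requires sampling
        "best_of": 3,
        "max_new_tokens": 48,
        "details": True,
    },
}
resp = requests.post("http://localhost:8080/generate", json=payload, timeout=60)
resp.raise_for_status()
body = resp.json()

# Collect the winning sequence plus the other sampled candidates.
outputs = [body["generated_text"]]
outputs += [seq["generated_text"] for seq in body.get("details", {}).get("best_of_sequences", [])]
print(outputs)
```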
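For the sampling-parameters issue, a common cause is that do_sample defaults to false, in which case greedy decoding is used and top_k / typical_p have no visible effect. Below is a minimal request sketch against a locally running server; the URL and port are assumptions (default setup), and the parameter names follow the documented /generate endpoint.

```python
import requests

# Minimal sketch: querying a locally running TGI server (default port 8080 assumed).
# If do_sample is left at its default (False), greedy decoding is used and
# top_k / typical_p have no visible effect on the output.
payload = {
    "inputs": "What is Deep Learning?",
    "parameters": {
        "do_sample": True,     # enable sampling so the knobs below apply
        "top_k": 50,
        "typical_p": 0.95,
        "temperature": 0.8,
        "max_new_tokens": 64,
    },
}
resp = requests.post("http://localhost:8080/generate", json=payload, timeout=60)
resp.raise_for_status()
print(resp.json()["generated_text"])
```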
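On the repetition_penalty / presence_penalty question: the two penalties have different functional forms, so no constant offset maps one exactly onto the other. The HF/CTRL-style repetition_penalty rescales logits multiplicatively, while the OpenAI-style presence_penalty shifts them additively. The sketch below contrasts the two formulations in plain NumPy; it illustrates the standard definitions and is not TGI's internal implementation.

```python
import numpy as np

def apply_repetition_penalty(logits, seen_ids, penalty):
    # HF/CTRL-style multiplicative penalty: for every token already generated,
    # divide its logit by `penalty` if positive, multiply if negative.
    out = logits.copy()
    for i in set(seen_ids):
        out[i] = out[i] / penalty if out[i] > 0 else out[i] * penalty
    return out

def apply_presence_penalty(logits, seen_ids, penalty):
    # OpenAI-style additive presence penalty: subtract a flat `penalty` from
    # any token that has appeared at least once, regardless of logit sign.
    out = logits.copy()
    for i in set(seen_ids):
        out[i] -= penalty
    return out

logits = np.array([2.0, -1.0, 0.5])
seen = [0, 2]
print(apply_repetition_penalty(logits, seen, 1.2))  # rescaled logits
print(apply_presence_penalty(logits, seen, 0.2))    # shifted logits
```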