aphrodite-engine
https://github.com/pygmalionai/aphrodite-engine
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Not yet supported
1 Subscriber
Help out
- Issues
  - [Bug]: Docker instance doesn't download model (affects VLLM as well)
  - Reduce peak memory for prompt_logprobs requests
  - Revert "feat: add support for chunked prefill + prefix caching (#871)"
  - Add Repetition Range ('rep_range')
  - [Misc]: should we be using flashinfer for CUDA 12.1 or 12.4?
  - [Feature]: add back exl2 support?
  - [Bug]: Error at Custom KoboldAI Endpoint! The custom endpoint failed to respond correctly. You may wish to try a different URL or API type.
  - lora: add scaling factor support for LoRA at runtime
  - [Bug]: ModuleNotFoundError: No module named 'ray'
  - [Bug]: Generation sometimes slows to a crawl for all requests when there is a DRY sampler request
- Docs
  - Not yet supported