aphrodite-engine
https://github.com/pygmalionai/aphrodite-engine
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
not yet supported
1 Subscriber
Help out
- Issues
  - gguf: optimize prefill speeds for Q4_K quants
  - [WIP] feat: ExLlamaV3 quantization format
  - [Kernel] feat: add Metal support for Apple Silicon GPU
  - [Kernel][Experimental] feat: add Vulkan backend
  - [Kernel] feat: add custom CUDA kernels for all sampling ops
  - fix: priority scheduler crashing in V1 scheduler under high load
  - fix: illegal memory access for FP8 MoE models in cutlass 3x grouped gemm kernel
  - [Kernel] feat: sage attention support
  - [Kernel][Quantization] feat: add Gluon kernels for AWQ quantization
  - [Feature]: Support for Triton attention backend for inference
- Docs
  - not yet supported