deepspeed
https://github.com/microsoft/deepspeed
Python
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
6 Subscribers
Help out
- Issues
- Adding DS Feature API in accelerator
- uniform deepspeed overflow check
- [XPU] support op builder from intel_extension_for_pytorch kernel path
- [BUG] Trying to finetune mistral using deepspeed but running into an error: Error building extension 'cpu_adam'
- enable yuan autotp & add conv tp
- [BUG] No `universal_checkpoint_info` in the Accelerate+Deepspeed Checkpoint
- Optimize zero3 fetch params using all_reduce
- [BUG] Tensors are on different devices when model.step()
- When using pure DeepSpeed ulysses and zero stage 3 to continue pre-training, the loss gap between each GPU is too large.[BUG]
- [BUG] Gradient Accumulation Steps Initialization Bug in Pipeline Parallel Mode
- Docs
- Python not yet supported