deepspeed
https://github.com/microsoft/deepspeed
Python
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
6 Subscribers
Help out
- Issues
- Revert "stage3: efficient compute of scaled_global_grad_norm (#5256)"
- Update ds-chat CI workflow paths to include zero stage 1-3 files
- Fix torch.compile error for PyTorch v2.3
- Fix crash when creating Torch tensor on NPU with device=get_accelerator().current_device()
- how to gather checkpoints to master node during multi-nodes training
- Fix compile wrapper
- Un-pin torch version in nv-torch-latest back to latest
- [REQUEST] i want to know how to merge deepspeed multi gpu optim file into one pytorch optim.pt file ?
- [REQUEST] Remove scary warnings from deepspeed import
- [REQUEST] How can one specify the CPU architecture to target.
- Docs
- Python not yet supported