llm.c
https://github.com/karpathy/llm.c
CUDA
LLM training in simple, raw C/CUDA
- Issues
- GELU Fusion with cuBLASLt (SLOWER because it only merges in FP16 mode, not BF16/FP32...)
- Experimenting with global instantiation for the layouts
- Overlap gradient computation and NCCL AllReduce
- Make ceil_div constexpr and inline
- weight reordering: attempt 1
- [dev/cuda] Added warpsize as a constant expr for dev/cuda files
- Add DockerFile
- Added constexpr for blocksizes to optimize compilation
- Added additional layernorm forward kernel that does not recalculate mean and rstd
- Removing B in encoder and replacing with calculated N in kernel