llm.c
https://github.com/karpathy/llm.c
Cuda
LLM training in simple, raw C/CUDA
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Cuda not yet supported
2 Subscribers
Help out
- Issues
  - 7-8% speedup: optimize matmul_backward_bias_kernel, reduce cast ops, improve loop unrolling, direct var use
  - Add explicit HuggingFace cache dir
  - dev/download_starter_pack.sh: adding SIGINT trap and current download…
  - Improve kernel in layerNorm forward: adapt variance estimation method from kernel 4 for use in kernel 6
  - Specify torch version number in requirements.txt ?
  - Add optimized GPU kernels for encoder_backward using shared memory
  - Add high perf mode
  - MPI run with 8 GPU fails
  - Fix sizing typo in `train_gpt2_fp32.cu`
  - llm.c for inference
- Docs
  - Cuda not yet supported