nanogpt
https://github.com/karpathy/nanogpt
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
not yet supported
1 Subscriber
Help out
- Issues
- Slurm cluster
- Model fine-tuned to GPT2 generates occasional <|endoftext|> token
- How to change vocabulary size/number of tokens when tokenizing openwebtext?
- Make tiktoken install optional
- Is it worth it to add the FAVOR+ algorithm from Performers?
- python sample.py output is empty for training with cuda device but is ok with cpu
- eval_gpt2 error: mismatch for transformers.h* copying a param with shape * from checkpoint, the shape ...
- Add some opinionated guide for fine-tuning
- Does the order of weight decay parameters matter?
- Support for Model Parallelism for Large-scale Models
- Docs
- not yet supported