nltk
https://github.com/nltk/nltk
Python
NLTK Source
Triage Issues!
When you volunteer to triage issues, you'll receive an email each day with a link to an open issue that needs help in this project. You'll also receive instructions on how to triage issues.
Triage Docs!
Receive a documented method or class from your favorite GitHub repos in your inbox every day. If you're really pro, receive undocumented methods or classes and supercharge your commit history.
Python not yet supported
38 Subscribers
Help out
- Issues
  - 10x Faster Levenshtein Distances
  - Make WordNet's synset relations available from the lemmas
  - Fix #3124- bug with PickleCorpusView raising UnicodeDecodeError
  - Avoid recursive suffix stripping in wordnet morphy
  - Implement vocabulary introduction for texttiling
  - fix for word_tokenize() Failing to Split English Contractions When Followed by [\t\n\f\r]
  - KneserNeyInterpolated has problem with OOV words during testing and perplexity is always inf
  - `TreebankWordDetokenizer().detokenize()` introduces unexpected spaces before periods.
  - Refactor LanguageModel class, adding split functionality and unit tests
  - Tokenizer punkt zip file sometimes does not unpackage
- Docs
  - Python not yet supported