Releases: PyThaiNLP/pythainlp
PyThaiNLP 2.1.dev5
- Change from marisa-trie to a Trie implementation written in Python
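As a rough illustration of what a trie written in plain Python can look like, here is a minimal sketch. The names `SimpleTrie`, `_Node`, and `prefixes` are hypothetical and chosen for this example only; this is not the class shipped in PyThaiNLP, and the real implementation may differ in structure and API.

```python
from typing import Dict, Iterable, List


class _Node:
    """One trie node: child nodes keyed by character, plus an end-of-word flag."""

    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        self.is_word = False


class SimpleTrie:
    """Hypothetical pure-Python trie; an illustration, not PyThaiNLP's class."""

    def __init__(self, words: Iterable[str] = ()) -> None:
        self._root = _Node()
        self._size = 0
        for word in words:
            self.add(word)

    def add(self, word: str) -> None:
        """Insert a word, creating nodes along its character path as needed."""
        node = self._root
        for ch in word:
            node = node.children.setdefault(ch, _Node())
        if not node.is_word:
            node.is_word = True
            self._size += 1

    def __contains__(self, word: str) -> bool:
        node = self._root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return False
        return node.is_word

    def __len__(self) -> int:
        return self._size

    def prefixes(self, text: str) -> List[str]:
        """Return every stored word that is a prefix of text, shortest first."""
        found = []
        node = self._root
        for i, ch in enumerate(text):
            node = node.children.get(ch)
            if node is None:
                break
            if node.is_word:
                found.append(text[: i + 1])
        return found


trie = SimpleTrie(["กา", "การ", "การบ้าน"])
print("การ" in trie)                 # membership test
print(trie.prefixes("การบ้านเยอะ"))   # dictionary words that start the text
```

Prefix lookup of this kind is what dictionary-based tokenizers typically need, and a pure-Python version avoids a compiled dependency at the cost of some speed and memory.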
PyThaiNLP 2.1.dev4
- Add test cases for NER (merge pull request #273 from PyThaiNLP/ner-tag)
PyThaiNLP 2.0.7
change log
- Bug fix: pythainlp.util.normalize now also covers the THANTHAKHAT and SARA U / SARA UU cases (#244)
Upgrade: pip install -U pythainlp
Docs: https://thainlp.org/pythainlp/docs/2.0/
User guide: https://github.com/PyThaiNLP/pythainlp/blob/dev/notebooks/pythainlp-get-started.ipynb
PyThaiNLP 2.1.dev2
Update Version
PyThaiNLP 2.0.6
- fixed #230
- Newly trained ThaiNER model
PyThaiNLP 2.0.5
- Clean word lists in pythainlp.corpus (remove duplicates, etc.)
- Fix/add return type hinting for functions in pythainlp.corpus
- Fix deprecated inline flag for regular expression in pythainlp.corpus.tnc (Thai National Corpus)
- Bug fix: reorder condition checks in pythainlp.tokenize.dict_trie so it catches Trie before Iterable
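The reordering matters when a trie object is itself iterable: a check for Iterable placed first would also match an existing trie and rebuild it needlessly. The sketch below illustrates the idea under that assumption. The `Trie` class here is a hypothetical stand-in, and this `dict_trie` is a simplified imitation written for this example, not the library's actual code.

```python
from collections.abc import Iterable


class Trie:
    """Hypothetical stand-in: a word container that is also iterable."""

    def __init__(self, words):
        self.words = set(words)

    def __iter__(self):
        return iter(self.words)


def dict_trie(dict_source):
    """Return a Trie for dict_source (simplified imitation, not the real function)."""
    # Check the most specific type first: a Trie is also an Iterable,
    # so testing Iterable first would never reach this branch.
    if isinstance(dict_source, Trie):
        return dict_source
    if isinstance(dict_source, Iterable):
        return Trie(dict_source)
    raise TypeError("dict_source must be a Trie or an iterable of words")


existing = Trie(["แมว", "ปลา"])
print(dict_trie(existing) is existing)          # the same object is reused
print(isinstance(dict_trie(["แมว"]), Trie))     # a list is wrapped in a new Trie
```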
PyThaiNLP 2.0.4
- word_tokenize()'s argument whitespaces is now keep_whitespace to make it more explicit; default behavior is to keep whitespaces
- word_tokenize() can now take a custom dictionary through the custom_dict parameter
- dict_word_tokenize() will be deprecated soon
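To make the effect of the two parameters concrete, here is a toy greedy longest-match tokenizer that accepts the same parameter names, keep_whitespace and custom_dict. Everything else is an assumption made for illustration: `toy_word_tokenize` and `_DEFAULT_WORDS` are hypothetical, and the matching strategy is not the algorithm PyThaiNLP uses.

```python
import re
from typing import Iterable, List, Optional

# Tiny built-in word list, used only when no custom dictionary is given (hypothetical).
_DEFAULT_WORDS = frozenset({"แมว", "กิน", "ปลา"})
_WHITESPACE = re.compile(r"\s+")


def toy_word_tokenize(
    text: str,
    custom_dict: Optional[Iterable[str]] = None,
    keep_whitespace: bool = True,
) -> List[str]:
    """Greedy longest-match tokenizer; a toy, not PyThaiNLP's word_tokenize()."""
    words = set(custom_dict) if custom_dict is not None else _DEFAULT_WORDS
    max_len = max((len(w) for w in words), default=1)
    tokens: List[str] = []
    i = 0
    while i < len(text):
        ws = _WHITESPACE.match(text, i)
        if ws:
            # Whitespace runs become tokens only when keep_whitespace is True.
            if keep_whitespace:
                tokens.append(ws.group())
            i = ws.end()
            continue
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i : i + size] in words:
                break
        else:
            size = 1
        tokens.append(text[i : i + size])
        i += size
    return tokens


text = "แมว กินปลา"
print(toy_word_tokenize(text))                                  # whitespace kept by default
print(toy_word_tokenize(text, keep_whitespace=False))           # whitespace dropped
print(toy_word_tokenize(text, custom_dict={"แมว", "กินปลา"}))    # custom words change the split
```

With the default word list the text splits into three words plus the space; passing a custom dictionary that contains "กินปลา" keeps that string as a single token.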
PyThaiNLP 2.0.3
- Fix issue where the TTC (Thai Textbook Corpus) corpus always downloaded a new file
- Words and their frequencies from TTC (Thai Textbook Corpus) now have a local copy at ttc_freq.txt inside pythainlp.corpus.
- Other refactoring and code improvements, including ones related to subword tokenization (Thai Character Cluster / TCC and ETCC), see #193
PyThaiNLP 2.0.2
- Fixed tree map
- Subword tokeniser documentation improvement #190