 ![PyThaiNLP Logo](https://avatars0.githubusercontent.com/u/32934255?s=200&v=4)
 
-# PyThaiNLP 1.8.0
+# PyThaiNLP 2.0
 
 [![Codacy Badge](https://api.codacy.com/project/badge/Grade/cb946260c87a4cc5905ca608704406f7)](https://www.codacy.com/app/pythainlp/pythainlp_2?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=PyThaiNLP/pythainlp&amp;utm_campaign=Badge_Grade)[![pypi](https://img.shields.io/pypi/v/pythainlp.svg)](https://pypi.python.org/pypi/pythainlp)
 [![Build Status](https://travis-ci.org/PyThaiNLP/pythainlp.svg?branch=develop)](https://travis-ci.org/PyThaiNLP/pythainlp)
@@ -12,24 +12,51 @@ PyThaiNLP is a Python library for natural language processing (NLP) of Thai lang
 
 PyThaiNLP includes Thai word tokenizers, transliterators, soundex converters, part-of-speech taggers, and spell checkers.
 
-## What's new in version 1.8?
+## What's new in version 2.0?
 
 - New NorvigSpellChecker spell checker class, which can be initialized with a custom dictionary.
 - Terminate Python 2 support. Remove all Python 2 compatibility code.
 - Remove old, obsolete, deprecated, and experimental code.
-- see [PyThaiNLP 1.8 change log](https://github.com/PyThaiNLP/pythainlp/issues/118)
+- Thai2fit (upgrade ULMFiT-related code to fastai 1.0)
+- ThaiNER 1.0
+- Remove sentiment analysis
+- Improved word_tokenize (newmm, mm) and dict_word_tokenize
+- Improved POS-tagging
+- More and improved examples
+- see [PyThaiNLP 2.0 change log](https://github.com/PyThaiNLP/pythainlp/issues/118)
 
 ## Install
 
+For the stable version:
+
 ```sh
 pip install pythainlp
 ```
 
+Some advanced functionality, such as word vectors, may need extra packages. Install them by passing these options to pip install:
+
+```
+pip install pythainlp[extra1,extra2,...]
+```
+
+where each extra can be one of:
+
+- `artagger` (to support artagger part-of-speech tagger)*
+- `deepcut` (to support deepcut machine-learnt tokenizer)
+- `icu` (for ICU support in transliteration and tokenization)
+- `ipa` (for International Phonetic Alphabet support in transliteration)
+- `ml` (to support fastai 1.0.22 ULMFiT models)
+- `ner` (for named-entity recognizer)
+- `thai2fit` (for Thai word vector)
+- `thai2rom` (for machine-learnt romanization)
+- `full` (install everything)
+
 **Note for Windows**: `marisa-trie` wheels can be obtained from https://www.lfd.uci.edu/~gohlke/pythonlibs/#marisa-trie
 Install it with pip, for example: `pip install marisa_trie-0.7.5-cp36-cp36m-win32.whl`
 
 ## Links
 
-- Docs: https://thainlp.org/pythainlp/docs/1.7/
+- User guide: [English](https://colab.research.google.com/drive/1MQ10D1mJC5r1vQAHcj4ShoRS14vz8ZF-), [ภาษาไทย](https://colab.research.google.com/drive/1rEkB2Dcr1UAKPqz4bCghZV7pXx2qxf89)
+- Docs: https://thainlp.org/pythainlp/docs/2.0/
 - GitHub: https://github.com/PyThaiNLP/pythainlp
 - Issues: https://github.com/PyThaiNLP/pythainlp/issues
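
The extras syntax added to the Install section is easy to get wrong when several extras are combined, so here is a minimal sketch of how the command string could be assembled. `KNOWN_EXTRAS` and `pip_install_command` are hypothetical names made up for illustration and are not part of PyThaiNLP. The list of extras is copied from the README text in this change, and nothing here checks what the published package actually accepts.

```python
# Extras named in the README's Install section (copied from the diff above).
KNOWN_EXTRAS = (
    "artagger", "deepcut", "icu", "ipa", "ml",
    "ner", "thai2fit", "thai2rom", "full",
)


def pip_install_command(extras=()):
    """Build the pip command for pythainlp with optional extras.

    Hypothetical helper for illustration only; it formats a string and
    does not run pip.
    """
    unknown = [name for name in extras if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError("unknown extras: " + ", ".join(unknown))
    if not extras:
        return "pip install pythainlp"
    # Extras are comma-separated inside the brackets, with no spaces.
    return "pip install pythainlp[" + ",".join(extras) + "]"


if __name__ == "__main__":
    print(pip_install_command(["icu", "ner"]))
```

Some shells, zsh for example, treat square brackets as glob patterns, so the bracketed form may need quoting there: `pip install "pythainlp[icu,ner]"`.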