Welcome to the official PyThaiNLP project website.

The PyThaiNLP Project is a Thai natural language processing project. We build software and datasets for the Thai language.

PyThaiNLP is a Python package for text processing and linguistic analysis, similar to nltk, with a focus on the Thai language. Its features include:

- Convenient character and word classes, like Thai consonants (pythainlp.thai_consonants), vowels (pythainlp.thai_vowels), digits (pythainlp.thai_digits), and stop words (_stopwords), comparable to constants like string.ascii_letters, string.digits, and string.punctuation.
- Thai linguistic unit segmentation/tokenization, including sentence (sent_tokenize), word (word_tokenize), and subword segmentation based on Thai Character Cluster (subword_tokenize).
- Thai spelling suggestion and correction (spell and correct).
- Thai soundex (soundex) with three engines (lk82, udom83, metasound).
- Thai collation, that is, sorting by dictionary order (collate).
- Reading numbers out as Thai words (bahttext, num_to_thaiword).
- Thai datetime formatting (thai_strftime).
- A fix for text typed with the wrong Thai/English keyboard layout (eng_to_thai, thai_to_eng).
- A command-line interface for basic functions, like tokenization and POS tagging (run thainlp in your shell).

Please see our tutorials on how to apply these functions to machine-learning problems.

- Wannaphong Phatthiyaphaibun - foundation, distribution, and maintenance.
- Korakot Chaovavanich - initial tokenization and soundex code.
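The character-class constants mentioned in the feature list can be used much like `string.digits`. The following is a minimal sketch, not part of PyThaiNLP itself: `to_arabic_digits` and `count_thai_digits` are hypothetical helper names, and the fallback branch assumes that `pythainlp.thai_digits` is the ten Thai digits in ascending order (U+0E50 through U+0E59), so the example also runs when the package is not installed.

```python
import string

try:
    # Constant named in the feature list; requires PyThaiNLP to be installed.
    from pythainlp import thai_digits
except ImportError:
    # Fallback (assumption): the ten Thai digits, U+0E50 through U+0E59.
    thai_digits = "".join(chr(cp) for cp in range(0x0E50, 0x0E5A))

# Map each Thai digit to the ASCII digit at the same position.
_THAI_TO_ARABIC = str.maketrans(thai_digits, string.digits)


def to_arabic_digits(text: str) -> str:
    """Replace Thai digits in `text` with ASCII digits (hypothetical helper)."""
    return text.translate(_THAI_TO_ARABIC)


def count_thai_digits(text: str) -> int:
    """Count how many characters in `text` are Thai digits (hypothetical helper)."""
    return sum(ch in thai_digits for ch in text)


print(to_arabic_digits("พ.ศ. ๒๕๖๗"))   # the Thai digits become 2567
print(count_thai_digits("พ.ศ. ๒๕๖๗"))  # 4
```

Because the constant is a plain string, it works directly with `in`, `str.maketrans`, and similar standard-library tools, which is the sense in which it is comparable to `string.digits`.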