Isarn Dharma Alphabets Lexicon For Language Processing

Cited by: 0
Authors
Phaiboon, Nongnud [1]
Seresangtakul, Pusadee [1]
Affiliations
[1] Khon Kaen Univ, Fac Sci, Dept Comp Sci, Khon Kaen, Thailand
Keywords
Isarn Dharma Alphabets; Lexicon; NLP;
DOI
Not available
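Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code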
0808 ; 0809 ;
Abstract
A lexicon is a collection of the individual words of a language and is essential for natural language processing (NLP) research such as machine translation, word segmentation, and speech processing. To support computerized systems for Isarn Dharma Alphabets, this research aims to collect the important features needed for research in natural language and speech processing. In the study, an Isarn Dharma Alphabets lexicon was constructed using a trie structure. Each lexicon entry consists of the Isarn Dharma Alphabets word, the Thai word, the English word, phonemes, part of speech, sub-part of speech, special characteristics, a Thai description, and an English description. The lexicon contains approximately 8,000 words. In addition, an Isarn Dharma Alphabets transcription system based on linguistic rules is proposed.
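The abstract does not give implementation details, so the following is only a minimal sketch of how a trie-backed lexicon of this kind might be organized. The class names, the reduced set of entry fields, and the placeholder words are all assumptions made for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class LexiconEntry:
    # Hypothetical subset of the attributes listed in the abstract.
    thai: str
    english: str
    phonemes: str
    pos: str


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    entry: Optional[LexiconEntry] = None  # set only on nodes where a word ends


class TrieLexicon:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str, entry: LexiconEntry) -> None:
        """Walk or create one node per character, then attach the entry."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.entry = entry

    def lookup(self, word: str) -> Optional[LexiconEntry]:
        """Return the entry stored for an exact word, or None if absent."""
        node = self.root
        for ch in word:
            node = node.children.get(ch)
            if node is None:
                return None
        return node.entry

    def prefixes(self, text: str, start: int = 0) -> List[str]:
        """Return every lexicon word that begins at text[start], shortest first."""
        node, found = self.root, []
        for i in range(start, len(text)):
            node = node.children.get(text[i])
            if node is None:
                break
            if node.entry is not None:
                found.append(text[start:i + 1])
        return found


# Placeholder ASCII keys stand in for Isarn Dharma Alphabets words.
lexicon = TrieLexicon()
lexicon.insert("ab", LexiconEntry("thai-1", "english-1", "phonemes-1", "noun"))
lexicon.insert("abc", LexiconEntry("thai-2", "english-2", "phonemes-2", "verb"))
candidates = lexicon.prefixes("abcd")
```

Here `prefixes` shows why a trie suits the word segmentation task mentioned in the abstract: a single pass over the input yields every dictionary word starting at a given position, which a longest-match or similar segmenter could then choose among.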
Pages: 211-215
Number of pages: 5