Predicting Morphologically-Complex Unknown Words in Igbo

Cited by: 4
Authors
Onyenwe, Ikechukwu E. [1 ]
Hepple, Mark [1 ]
Affiliations
[1] Univ Sheffield, Dept Comp Sci, NLP Grp, Sheffield, S Yorkshire, England
Source
TEXT, SPEECH, AND DIALOGUE | 2016 / Vol. 9924
Keywords
Morphology; Morphological reconstruction; Igbo; Unknown words prediction; Part-of-speech tagging;
DOI
10.1007/978-3-319-45510-5_24
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The effective handling of previously unseen words is an important factor in the performance of part-of-speech taggers. Some trainable POS taggers use suffix (sometimes prefix) strings as cues in handling unknown words (in effect serving as a proxy for actual linguistic affixes). In the context of creating a tagger for the African language Igbo, we compare the performance of some existing taggers, implementing such an approach, to a novel method for handling morphologically complex unknown words, based on morphological reconstruction (i.e. a linguistically-informed segmentation into root and affixes). The novel method outperforms these other systems by several percentage points, achieving accuracies of around 92% on morphologically-complex unknown words.
Pages: 206-214
Page count: 9