The aim of this article is to investigate how (i) n-gram analysis and (ii) the application of grammatical rules can improve the lexical recall of the spelling checker for Sesotho sa Leboa developed by the Centre for Text Technology, North-West University, in cooperation with the Department of African Languages at the University of Pretoria. It will be shown that, for a disjunctively written language like Sesotho sa Leboa, lexical recall exceeding 95% can be obtained by using a list of frequently occurring words. The paper will first investigate the efficiency of using grapheme-based n-gram models in the spellchecking procedure. Second, it will discuss the utilization of grammatical rules to increase lexical recall, focusing on nominal constructions such as the diminutive, locative and augmentative, and also on verbal suffixes and suffix combinations.
Affiliation:
North West Univ, South African Ctr Digital Language Resources SADi, Potchefstroom, South Africa