A Spell Checker for a Low-resourced and Morphologically Rich Language

Authors
Octaviano, Manolito, Jr. [1]
Borra, Allan [1]
Affiliations
[1] De La Salle Univ, Coll Comp Studies, Manila, Philippines
DOI
Not available
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809;
Abstract
Spell checking plays an important role in improving the quality of documents by identifying misspelled words. Considerable effort has gone into advancing spell checkers for other languages, such as English, whose spell checking systems are almost perfected (e.g. Microsoft Word). However, few efforts have been made to develop an efficient Filipino spell checker. One major challenge of existing Filipino spell checkers, being dictionary-based, is the lack of a complete dictionary that captures all inflected forms of a word (e.g. isinasama 'including', isasama 'will be included', and isinama 'included' with the base form sama 'include'), borrowings (e.g. magtex 'to text' and nagtex 'texted'), and code-switching (e.g. magtext 'to text' and nag-text 'texted' with the base form 'text'). In addition, existing systems cannot handle code-switching, so valid words are marked as erroneous. In this research, a spell checker is designed for Filipino, a low-resourced and morphologically rich language. It detects and corrects typographical errors in the language and introduces a modified version of the Metaphone algorithm for ranking candidate suggestions. The system achieves 81% recall, 53.64% precision, 64.53% F-measure, and 87.78% suggestion adequacy on 100 sentences taken from exercise documents of Filipino students.
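As a quick sanity check on the reported scores, the F-measure can be recomputed from the stated precision and recall. The sketch below assumes the abstract uses the balanced F1 score (the harmonic mean of precision and recall); the abstract does not say which variant was used, and the function name `f_measure` is only illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    # Convention: return 0 when both inputs are 0 to avoid dividing by zero.
    if precision <= 0.0 and recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Figures as stated in the abstract (assumed here to be F1 inputs).
reported_precision = 0.5364
reported_recall = 0.81

f1 = f_measure(reported_precision, reported_recall)
print(f"F1 = {f1:.2%}")
```

Under that assumption the recomputed value agrees with the reported 64.53% to within rounding of the published precision and recall.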
Pages: 1853 - 1856
Page count: 4