Enhancing HMM-based POS tagger for Mizo language

Authors
Nunsanga, Morrel V. L. [1]
Pakray, Partha [2]
Devi, Toijam Sonalika [1]
Singh, L. Lolit Kr [3]
Affiliations
[1] Mizoram Univ, Dept Informat Technol, Mizoram 796004, India
[2] NIT Silchar, Dept CSE, Silchar, Assam, India
[3] Mizoram Univ, Dept ECE, Mizoram, India
Keywords
Hybrid POS tagger; rule-based POS tagger; N-gram tagger; Mizo POS tagger; Hidden Markov Model;
DOI
10.3233/JIFS-224220
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Part-of-speech (POS) tagging is the process of assigning each word its appropriate part of speech. Achieving good tagger performance requires a substantial amount of well-organized data or corpora and significant research on the target language. Mizo is an under-resourced language that needs more research attention in computational linguistics. The limited availability of corpora and relevant literature adds complexity to the task of assigning POS labels to Mizo text. This paper explores two methods to potentially improve the Hidden Markov Model (HMM)-based POS tagger for the Mizo language. The proposed taggers are compared with the baseline HMM tagger and the N-gram taggers on the designed Mizo corpus, which consists of 72,077 manually tagged tokens. The experimental results show that the two proposed taggers improve on the HMM-based Mizo POS tagger, achieving 81.52% and 84.29% accuracy, respectively. Moreover, a comprehensive analysis of the performance of the suggested hybrid tagger yielded a weighted average precision, recall, and F1-score of 83.09%, 77.88%, and 79.64%, respectively.
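For readers unfamiliar with the baseline technique, the following is a minimal sketch of a bigram HMM tagger decoded with the Viterbi algorithm. It is not the authors' implementation: the abstract does not specify their model order, smoothing, or tagset. The function names (`train_hmm`, `log_prob`, `viterbi`), the add-one smoothing, and the three-sentence English toy corpus are all assumptions made only for illustration.

```python
import math
from collections import Counter, defaultdict


def train_hmm(tagged_sentences):
    """Count tag-transition and word-emission events from tagged sentences."""
    transitions = defaultdict(Counter)  # previous tag -> counts of next tag
    emissions = defaultdict(Counter)    # tag -> counts of emitted word
    vocab = set()
    for sentence in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            transitions[prev][tag] += 1
            emissions[tag][word] += 1
            vocab.add(word)
            prev = tag
    return transitions, emissions, vocab


def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (the smoothing is an assumption)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))


def viterbi(words, transitions, emissions, vocab):
    """Return the most probable tag sequence for `words`."""
    if not words:
        return []
    tags = sorted(emissions)
    n_tags = len(tags)
    n_words = len(vocab) + 1  # one extra outcome reserved for unseen words

    # Each column maps tag -> (best log score so far, best previous tag).
    columns = [{
        t: (log_prob(transitions["<s>"], t, n_tags)
            + log_prob(emissions[t], words[0], n_words), None)
        for t in tags
    }]
    for word in words[1:]:
        column = {}
        for t in tags:
            emit = log_prob(emissions[t], word, n_words)
            column[t] = max(
                (columns[-1][p][0] + log_prob(transitions[p], t, n_tags) + emit, p)
                for p in tags
            )
        columns.append(column)

    # Follow the backpointers from the best final tag.
    tag = max(tags, key=lambda t: columns[-1][t][0])
    path = [tag]
    for column in reversed(columns[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]


# Toy English corpus, purely illustrative (not the Mizo corpus from the paper).
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
transitions, emissions, vocab = train_hmm(corpus)
print(viterbi(["the", "cat", "barks"], transitions, emissions, vocab))
```

On this toy data the decoder returns `['DET', 'NOUN', 'VERB']`. A rule-based or hybrid component of the kind named in the keywords would sit before or after such a decoder; how the paper combines them is not stated in the abstract.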
Pages: 11725-11736
Number of pages: 12