An Adaptive Harmony Search Part-of-Speech tagger for Square Hmong Corpus

Cited by: 1
Authors
Kang, Di-Wen [1]
Ye, Shao-Qiang [2, 4]
Ahmad, Sharifah Zarith Rahmah Syed [2 ]
Mo, Li-Ping [3 ]
Qin, Feng [2 ]
Zhou, Pan [1 ]
Affiliations
[1] Jishou Univ, Sch Commun & Elect Engn, Jishou 416000, Peoples R China
[2] Univ Teknol Malaysia, Fac Comp, Skudai 80310, Johor, Malaysia
[3] Jishou Univ, Coll Comp Sci & Engn, Jishou, Hunan, Peoples R China
[4] Hunan Appl Technol Univ, Coll Informat & Engn, Changde 415000, Hunan, Peoples R China
Keywords
Harmony Search Algorithm; Low-resource language; Optimization; Part-of-Speech tagging; Unknown words; ALGORITHM;
DOI
10.21123/bsj.2024.9694
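Chinese Library Classification (CLC) number
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Subject classification code
07; 0710; 09
Abstract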
Data-driven models perform poorly on part-of-speech tagging for square Hmong, a low-resource language. This paper designs a weight evaluation function to reduce the influence of unknown words and proposes an improved harmony search algorithm that uses roulette and local evaluation strategies to handle square Hmong part-of-speech tagging. Experiments show that the average accuracy of the proposed model is 6% and 8% higher than that of the HMM and BiLSTM-CRF models, respectively. The average F1 of the proposed model is likewise 6% and 3% higher than that of the HMM and BiLSTM-CRF models, respectively.
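The abstract does not give the algorithm's details, so the following is only a minimal sketch of how a harmony search with roulette-wheel memory selection could be applied to tag sequences. It is not the authors' method. The tag set, lexicon, transition scores, `UNKNOWN_WEIGHT`, and the function names `score`, `roulette_select`, and `harmony_search_tag` are all hypothetical, and the tokens are placeholders rather than real square Hmong data. The local evaluation strategy mentioned in the abstract is not modeled here.

```python
import random

# Hypothetical toy resources; placeholder tokens, not real square Hmong data.
TAGS = ["N", "V", "ADJ"]
LEXICON = {"w1": {"N": 0.9, "V": 0.1}, "w2": {"V": 0.8, "N": 0.2}}
TRANSITION = {("N", "V"): 0.7, ("V", "N"): 0.6, ("ADJ", "N"): 0.8}
UNKNOWN_WEIGHT = 0.2  # assumed down-weighting factor for out-of-lexicon words


def score(words, tag_seq):
    """Non-negative fitness: lexicon evidence plus tag-transition evidence."""
    total, prev = 0.0, None
    for word, tag in zip(words, tag_seq):
        if word in LEXICON:
            total += LEXICON[word].get(tag, 0.0)
        else:
            # Unknown word: uniform guess, scaled down so it cannot dominate.
            total += UNKNOWN_WEIGHT / len(TAGS)
        if prev is not None:
            total += TRANSITION.get((prev, tag), 0.0)
        prev = tag
    return total


def roulette_select(items, weights, rng):
    """Pick one item with probability proportional to its weight."""
    total = sum(weights)
    if total <= 0:
        return rng.choice(items)  # all weights zero: fall back to uniform
    r = rng.random() * total
    acc = 0.0
    for item, weight in zip(items, weights):
        acc += weight
        if r < acc:
            return item
    return items[-1]  # guard against floating-point rounding


def harmony_search_tag(words, tags, fitness_fn, hms=10, hmcr=0.9, par=0.3,
                       iters=300, seed=0):
    """Search for a high-scoring tag sequence; returns (tags, fitness)."""
    rng = random.Random(seed)
    # Harmony memory: hms random tag sequences and their fitness values.
    memory = [[rng.choice(tags) for _ in words] for _ in range(hms)]
    fitness = [fitness_fn(words, h) for h in memory]
    for _ in range(iters):
        new = []
        for i in range(len(words)):
            if rng.random() < hmcr:
                # Memory consideration: roulette over stored harmonies.
                tag = roulette_select(memory, fitness, rng)[i]
                if rng.random() < par:
                    # Pitch adjustment for discrete tags: resample the tag.
                    tag = rng.choice(tags)
            else:
                tag = rng.choice(tags)  # random consideration
            new.append(tag)
        new_fit = fitness_fn(words, new)
        worst = min(range(hms), key=fitness.__getitem__)
        if new_fit > fitness[worst]:
            memory[worst], fitness[worst] = new, new_fit
    best = max(range(hms), key=fitness.__getitem__)
    return memory[best], fitness[best]
```

Calling `harmony_search_tag(["w1", "w2", "w3"], TAGS, score)` returns a tag sequence and its fitness. Here `"w3"` is absent from the toy lexicon, so its tag is decided mainly by the transition scores, which illustrates the general idea of limiting the influence of unknown words.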
Pages: 622-632
Number of pages: 11