An Adaptive Harmony Search Part-of-Speech tagger for Square Hmong Corpus

Cited by: 1
Authors
Kang, Di-Wen [1]
Ye, Shao-Qiang [2, 4]
Ahmad, Sharifah Zarith Rahmah Syed [2]
Mo, Li-Ping [3]
Qin, Feng [2]
Zhou, Pan [1]
Affiliations
[1] Jishou Univ, Sch Commun & Elect Engn, Jishou 416000, Peoples R China
[2] Univ Teknol Malaysia, Fac Comp, Skudai 80310, Johor, Malaysia
[3] Jishou Univ, Coll Comp Sci & Engn, Jishou, Hunan, Peoples R China
[4] Hunan Appl Technol Univ, Coll Informat & Engn, Changde 415000, Hunan, Peoples R China
Keywords
Harmony Search Algorithm; Low-resource language; Optimization; Part-of-Speech tagging; Unknown words; ALGORITHM
DOI
10.21123/bsj.2024.9694
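Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences]
Discipline classification codes
07; 0710; 09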
Abstract
Data-driven models perform poorly on part-of-speech tagging for square Hmong, a low-resource language. This paper designs a weight evaluation function to reduce the influence of unknown words, and proposes an improved harmony search algorithm that uses roulette and local evaluation strategies to handle square Hmong part-of-speech tagging. Experiments show that the average accuracy of the proposed model is 6% and 8% higher than that of the HMM and BiLSTM-CRF models, respectively. The average F1 of the proposed model is likewise 6% and 3% higher than that of the HMM and BiLSTM-CRF models, respectively.
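The abstract names harmony search with a roulette strategy but gives no algorithmic detail, so the following is only a minimal sketch of a generic harmony search over tag sequences with roulette-wheel selection from the harmony memory. It is not the authors' model: the paper's weight evaluation function and local evaluation strategy are not reproduced, and the function names, parameters (`hmcr`, `par`, `memory_size`), the placeholder tokens, and the toy `lexicon_score` fitness are all assumptions made for illustration.

```python
import random

def roulette_select(weights, rng):
    """Pick an index with probability proportional to its non-negative weight."""
    total = sum(weights)
    if total <= 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.randrange(len(weights))
    pick = rng.random() * total
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if pick < running:
            return i
    return len(weights) - 1

def harmony_search_tag(words, tags, score, memory_size=8, hmcr=0.9,
                       par=0.1, iterations=300, seed=0):
    """Search for a high-scoring tag sequence; `score` must return a value >= 0."""
    rng = random.Random(seed)
    n = len(words)
    # Harmony memory: random candidate tag sequences and their fitness.
    memory = [[rng.choice(tags) for _ in range(n)] for _ in range(memory_size)]
    fitness = [score(words, h) for h in memory]
    for _ in range(iterations):
        new = []
        for pos in range(n):
            if rng.random() < hmcr:
                # Memory consideration: roulette favours fitter harmonies.
                tag = memory[roulette_select(fitness, rng)][pos]
                if rng.random() < par:
                    # Pitch adjustment for discrete tags: resample the tag.
                    tag = rng.choice(tags)
            else:
                # Random consideration.
                tag = rng.choice(tags)
            new.append(tag)
        new_fit = score(words, new)
        worst = min(range(memory_size), key=fitness.__getitem__)
        if new_fit > fitness[worst]:
            # Replace the worst harmony when the new one is better.
            memory[worst], fitness[worst] = new, new_fit
    best = max(range(memory_size), key=fitness.__getitem__)
    return memory[best], fitness[best]

# Hypothetical toy lexicon with placeholder tokens (not real corpus data).
TOY_LEXICON = {"w1": "N", "w2": "V", "w3": "N"}

def lexicon_score(words, candidate):
    """Toy fitness: number of positions whose tag matches the toy lexicon."""
    return sum(1 for w, t in zip(words, candidate) if TOY_LEXICON.get(w) == t)
```

A call such as `harmony_search_tag(["w1", "w2", "w3"], ["N", "V"], lexicon_score)` returns the best tag sequence in memory together with its fitness. A real tagger would replace `lexicon_score` with a fitness that accounts for context and unknown words.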
Pages: 622-632
Page count: 11