Unknown word processing in HMM-based POS tagging

Cited by: 0
Authors
Zhang, Xiaofei [1]
Huang, Heyan [1]
Zhang, Daoyang
Affiliations
[1] Chinese Acad Sci, Res Ctr Comp & Language Informat Engn, Beijing 100097, Peoples R China
Keywords
part-of-speech tagging; Hidden Markov model; corpus
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Part-of-speech (POS) ambiguity is a major source of ambiguity in natural language processing, and disambiguating the POS of unknown words is especially difficult. This paper proposes a new POS tagging approach based on the Hidden Markov Model (HMM) that converts the task of tagging unknown words into the computation of their emission probabilities. The method handles unknown words in POS tagging more effectively: compared with other HMM-based POS tagging approaches, it improves average tagging accuracy by 1%, reaching 97% accuracy in the closed test and 95% in the open test. These results support the validity of the method.
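The abstract does not say how the emission probability of an unknown word is estimated. Purely as an illustration of the general idea, the sketch below trains a bigram HMM tagger and, for a word absent from the training data, substitutes an assumed estimate: the add-one-smoothed share of each tag's tokens that are words seen only once in training. That singleton-based estimate, the toy corpus, and all function names are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Count tag transitions and word emissions from (word, tag) sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag]
    emit = defaultdict(Counter)   # emit[tag][word]
    for sentence in tagged_sentences:
        prev = "<s>"
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def tag_sentence(words, trans, emit):
    """Viterbi decoding; unknown words get an assumed singleton-based emission."""
    if not words:
        return []
    tags = list(emit)
    word_counts = Counter()
    for tag in tags:
        word_counts.update(emit[tag])
    tag_totals = {t: sum(emit[t].values()) for t in tags}
    # Number of singleton (seen-once) word types observed with each tag.
    singletons = {t: sum(1 for w in emit[t] if word_counts[w] == 1) for t in tags}

    def trans_logp(prev, tag):
        # Add-one smoothed transition probability.
        total = sum(trans[prev].values())
        return math.log((trans[prev][tag] + 1) / (total + len(tags)))

    def emit_logp(word, tag):
        if word in word_counts:  # known word: relative frequency
            count = emit[tag][word]
            return math.log(count / tag_totals[tag]) if count else float("-inf")
        # Unknown word (assumption): treat it like the singletons of this tag.
        return math.log((singletons[tag] + 1) / (tag_totals[tag] + 1))

    # best[tag] = (log-probability, tag path) of the best sequence ending in tag
    best = {t: (trans_logp("<s>", t) + emit_logp(words[0], t), [t]) for t in tags}
    for word in words[1:]:
        step = {}
        for t in tags:
            score, path = max(
                ((best[p][0] + trans_logp(p, t), best[p][1]) for p in tags),
                key=lambda item: item[0],
            )
            step[t] = (score + emit_logp(word, t), path + [t])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]

corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")],
    [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("the", "DET"), ("bird", "NOUN"), ("sings", "VERB")],
]
trans, emit = train(corpus)
# "zebra" never occurs in the toy corpus, so its tag comes from the fallback.
print(tag_sentence(["the", "zebra", "runs"], trans, emit))  # ['DET', 'NOUN', 'VERB']
```

Because the unknown word contributes only through its emission term, the rest of the Viterbi decoder is unchanged; swapping in a different estimate (for example, one based on word suffixes) would only require changing `emit_logp`.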
Pages: 110-113
Page count: 4