ACUT: An Associative Classifier Approach to Unknown Word POS Tagging

Cited by: 0
Authors
Elahimanesh, Mohammad Hossein [1]
Minaei-Bidgoli, Behrouz [2]
Kermani, Fateme [1]
Affiliations
[1] Comp Res Ctr Islamic Sci, Qom, Iran
[2] Iran Univ Sci & Technol, Tehran, Iran
Keywords
Part-of-Speech tagging; Associative classifier; Hidden Markov Model; Unknown words;
DOI
10.1007/978-3-319-10849-0_26
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article focuses on Part-of-Speech (POS) tagging of unknown words. POS tagging is one of the fundamental requirements for intelligent, language-dependent text processing. The first aim of this article is therefore to provide a high-accuracy POS tagger for the Persian language. To handle unknown words, the article proposes combining a type of associative classifier with a Hidden Markov Model (HMM) algorithm. Associative classification is a new classification approach that integrates association mining and classification. The associative classifier used in this study is a new variant introduced by this research; it uses not only sequence probability but also the CBA classifier. CBA first generates all association rules that meet given support and confidence thresholds as candidate rules. It then selects a small set of these rules to form a classifier. When predicting the class label for an example, the best rule whose body is satisfied by the example is chosen for prediction. According to the experimental results, the proposed algorithm increases the accuracy of Persian unknown-word POS tagging to 81.8%. The overall accuracy of the proposed tagger is 98% and its sentence accuracy is 63.1%.
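The three CBA steps named in the abstract (candidate rule generation under support and confidence thresholds, selection of a small rule set, prediction by the best matching rule) can be illustrated with a minimal Python sketch. This is not the authors' ACUT implementation: it omits the HMM and sequence-probability components entirely, simplifies CBA's rule ordering and default-class choice, and uses an invented English toy dataset. The feature names (`suf=...`, `cap=...`), the function names, and the threshold values are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_rules(examples, min_support=0.2, min_confidence=0.65, max_len=2):
    """Generate candidate rules (body -> label) that meet both thresholds."""
    n = len(examples)
    body_counts = Counter()   # how often each item combination occurs
    rule_counts = Counter()   # how often it occurs together with a label
    for items, label in examples:
        for k in range(1, max_len + 1):
            for body in combinations(sorted(items), k):
                body_counts[body] += 1
                rule_counts[(body, label)] += 1
    rules = []
    for (body, label), count in rule_counts.items():
        support = count / n
        confidence = count / body_counts[body]
        if support >= min_support and confidence >= min_confidence:
            rules.append((confidence, support, body, label))
    # Simplified precedence: higher confidence, then higher support,
    # then shorter body; the body itself breaks remaining ties.
    rules.sort(key=lambda r: (-r[0], -r[1], len(r[2]), r[2]))
    return rules


def select_rules(rules, examples):
    """Keep a rule only if it correctly covers a still-uncovered example."""
    remaining = list(examples)
    selected = []
    for _conf, _sup, body, label in rules:
        covered = [ex for ex in remaining if set(body) <= ex[0]]
        if any(lbl == label for _, lbl in covered):
            selected.append((body, label))
            remaining = [ex for ex in remaining if not set(body) <= ex[0]]
        if not remaining:
            break
    # Simplified default class: majority label of whatever is left uncovered.
    pool = remaining or examples
    default = Counter(lbl for _, lbl in pool).most_common(1)[0][0]
    return selected, default


def predict(selected, default, items):
    """Return the label of the first (best) rule whose body is satisfied."""
    for body, label in selected:
        if set(body) <= items:
            return label
    return default


# Invented toy data: (features of a word, POS tag). Not from the paper.
EXAMPLES = [
    ({"suf=ing", "cap=no"}, "VERB"),
    ({"suf=ing", "cap=no"}, "VERB"),
    ({"suf=ly", "cap=no"}, "ADV"),
    ({"suf=ly", "cap=no"}, "ADV"),
    ({"suf=ed", "cap=no"}, "VERB"),
    ({"suf=son", "cap=yes"}, "NOUN"),
    ({"suf=ton", "cap=yes"}, "NOUN"),
    ({"suf=ing", "cap=yes"}, "NOUN"),
]

selected, default = select_rules(mine_rules(EXAMPLES), EXAMPLES)
print(predict(selected, default, {"suf=ly", "cap=no"}))
```

In this sketch an unseen word is tagged by the highest-ranked rule whose body is contained in its feature set, and falls back to the default class when no rule fires. How ACUT combines such rules with HMM sequence probabilities is not described in the abstract and is not modeled here.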
Pages: 250 / +
Number of pages: 3
Related papers
50 in total
  • [1] Unknown word processing in HMM-based POS tagging
    Zhang, Xiaofei
    Huang, Heyan
    Zhang, Daoyang
    [J]. RECENT ADVANCE OF CHINESE COMPUTING TECHNOLOGIES, 2007, : 110 - 113
  • [2] BosonNLP: An Ensemble Approach for Word Segmentation and POS Tagging
    Min, Kerui
    Ma, Chenggang
    Zhao, Tianmei
    Li, Haiyan
    [J]. NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2015, 2015, 9362 : 520 - 526
  • [3] Hybrid approach for Khmer unknown word POS guessing
    Nou, Chenda
    Kameyama, Wataru
    [J]. IRI 2007: PROCEEDINGS OF THE 2007 IEEE INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION, 2007, : 215 - +
  • [4] A Semisupervised Associative Classification Method for POS Tagging
    Rani, Pratibha
    Pudi, Vikram
    Sharma, Dipti Misra
    [J]. 2014 INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA), 2014, : 156 - 162
  • [5] Chinese Word POS Tagging with Markov Logic
    Liao, Zhihua
    Zeng, Qixian
    Wang, Qiyun
    [J]. INTELLIGENCE AND SECURITY INFORMATICS, PAISI 2015, 2015, 9074 : 91 - 101
  • [6] Unknown Words Analysis in POS tagging of Sinhala Language
    Jayaweera, A. J. P. M. P.
    Dias, N. G. J.
    [J]. 14TH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER) 2014, 2014, : 270 - 270
  • [7] A semi-supervised associative classification method for POS tagging
    Rani P.
    Pudi V.
    Sharma D.M.
    [J]. International Journal of Data Science and Analytics, 2016, 1 (2) : 123 - 136
  • [8] Word segmentation and POS tagging for Chinese keyphrase extraction
    Huang, XC
    Chen, J
    Yan, PL
    Luo, X
    [J]. ADVANCED DATA MINING AND APPLICATIONS, PROCEEDINGS, 2005, 3584 : 364 - 369
  • [9] A machine learning approach to POS tagging
    Màrquez, L
    Padró, L
    Rodríguez, H
    [J]. MACHINE LEARNING, 2000, 39 (01) : 59 - 91
  • [10] Chinese unknown word identification as known word tagging
    Fu, GH
    Luke, KK
    [J]. PROCEEDINGS OF THE 2004 INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND CYBERNETICS, VOLS 1-7, 2004, : 2612 - 2617