An Efficient Part-of-Speech Tagger for Arabic

Cited by: 0
Authors
Kopru, Selcuk [1]
Affiliations
[1] Teknol Yazilimevi Ltd, METU Technopolis, TR-06531 Ankara, Turkey
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we present an efficient part-of-speech (POS) tagger for Arabic based on a hidden Markov model (HMM). We explore different enhancements to improve the baseline system. Despite the morphological complexity of Arabic, our approach is data-driven and, unlike many other Arabic POS taggers, does not use a morphological analyzer or a lexicon. This makes the approach simple, very efficient, and suitable for real-life applications, while the accuracy obtained remains comparable to that of other Arabic POS taggers. In the experiments, we also thoroughly investigate aspects of Arabic POS tagging that have not been examined in detail before, including tag sets and prefix and suffix analyses. Our part-of-speech tagger achieves an accuracy of 95.57% on a standard tagset for Arabic. A detailed error analysis is provided for a better evaluation of the system. We also applied the same approach to other languages, such as Farsi and German, to show that it is language independent. Accuracy rates for these languages are also provided.
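The abstract does not give the model order, the smoothing scheme, or how prefixes and suffixes are used, so the following is only a minimal sketch of the general technique it names: a bigram HMM tagger decoded with Viterbi, trained purely from tagged data with no lexicon or morphological analyzer. The add-one smoothing, the two-character suffix fallback for unknown words, the function names, and the toy English corpus are all assumptions made for illustration, not the paper's method or data.

```python
import math
from collections import Counter, defaultdict

def train(tagged_sentences):
    """Count tag transitions, word emissions, and suffix emissions."""
    trans = defaultdict(Counter)   # trans[prev_tag][tag]
    emit = defaultdict(Counter)    # emit[tag][word]
    suffix = defaultdict(Counter)  # suffix[tag][last two characters of word]
    for sent in tagged_sentences:
        prev = "<s>"               # sentence-start pseudo tag
        for word, tag in sent:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            suffix[tag][word[-2:]] += 1
            prev = tag
    return trans, emit, suffix

def log_prob(counter, key, n_outcomes):
    """Add-one smoothed log probability (assumed smoothing scheme)."""
    return math.log((counter[key] + 1) / (sum(counter.values()) + n_outcomes))

def viterbi(words, trans, emit, suffix):
    """Return the most probable tag sequence under the bigram HMM."""
    if not words:
        return []
    tags = list(emit)
    vocab = {w for t in tags for w in emit[t]}
    n_suffixes = len({s for t in tags for s in suffix[t]})

    def emission(tag, word):
        # Known words use word emissions; unknown words back off to suffixes.
        if word in vocab:
            return log_prob(emit[tag], word, len(vocab))
        return log_prob(suffix[tag], word[-2:], n_suffixes)

    # best[tag] = (log score of best path ending in tag, that path)
    best = {t: (log_prob(trans["<s>"], t, len(tags)) + emission(t, words[0]), [t])
            for t in tags}
    for word in words[1:]:
        new = {}
        for t in tags:
            score, path = max(
                ((best[p][0] + log_prob(trans[p], t, len(tags)), best[p][1])
                 for p in tags),
                key=lambda item: item[0],
            )
            new[t] = (score + emission(t, word), path + [t])
        best = new
    return max(best.values(), key=lambda item: item[0])[1]

# Hypothetical toy corpus, used only to exercise the code.
corpus = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]
trans, emit, suffix = train(corpus)
print(viterbi(["the", "cat", "barks"], trans, emit, suffix))
print(viterbi(["the", "fox", "jumps"], trans, emit, suffix))
```

In the second call, "fox" and "jumps" are not in the training data, so their emission scores come only from the suffix counts. This is one simple way a purely data-driven tagger can handle unseen word forms without a lexicon.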
Pages: 202-213
Page count: 12