Improving part-of-speech tagging in Amharic language using deep neural network

Cited by: 1
Authors
Hirpassa, Sintayehu [1]
Lehal, G. S. [2]
Affiliations
[1] Adama Sci & Technol Univ, Dept Comp Sci, Adama, Ethiopia
[2] Punjabi Univ, Dept Comp Sci, Patiala, India
Keywords
Natural language processing; Conditional random fields; Recurrent neural network; Long short-term memory; Feature representation;
DOI
10.1016/j.heliyon.2023.e17175
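Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code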
07 ; 0710 ; 09 ;
Abstract
To date, several POS taggers have been introduced to support semantic analysis in different languages. However, POS tagging is more difficult in morphologically complex languages such as Amharic. In this paper, we evaluated three models for Amharic POS tagging: bidirectional long short-term memory (BiLSTM), a convolutional neural network (CNN) combined with BiLSTM, and a conditional random field (CRF). Various features, both language-dependent and language-independent, were explored in the CRF model, and word-level and character-level features were analyzed in the deep neural network models. A CNN was used to encode features at the word and character level. Each model's performance was evaluated on a dataset of 321K tokens manually tagged with 31 POS tags. The best performance, 97.23% accuracy, was obtained by an end-to-end deep neural network model that combines a CNN, a BiLSTM, and a CRF. This is the highest accuracy reported for the Amharic POS tagging task and is competitive with contemporary taggers for other languages.
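The abstract names a CRF as the final layer of the best model but does not describe how tags are decoded. As a rough illustration only, the sketch below shows Viterbi decoding for a linear-chain CRF, the standard way such a layer picks the highest-scoring tag sequence from per-token scores and tag-to-tag transition scores. The function name `viterbi_decode`, the two toy tags, and all scores are assumptions made for this example; they are not the paper's 31-tag set, its learned parameters, or its code.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[i][t] is the score of tag t at token i (in a neural tagger,
    this would come from the encoder); transitions[(p, t)] is the score of
    moving from tag p to tag t.
    """
    if not emissions:
        return []

    # Best score of any path that ends in each tag at the first token.
    best = {t: emissions[0][t] for t in tags}
    backpointers: List[Dict[str, str]] = []

    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes the path score into `cur`.
            prev = max(tags, key=lambda p: best[p] + transitions[(p, cur)])
            new_best[cur] = best[prev] + transitions[(prev, cur)] + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)

    # Follow the backpointers from the best final tag to recover the path.
    path = [max(tags, key=lambda t: best[t])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with two hypothetical tags and made-up scores.
TOY_TAGS = ["N", "V"]
TOY_EMISSIONS = [{"N": 2.0, "V": 0.0}, {"N": 1.0, "V": 0.9}]
TOY_TRANSITIONS = {
    ("N", "N"): -2.0,
    ("N", "V"): 1.0,
    ("V", "N"): 0.0,
    ("V", "V"): 0.0,
}

print(viterbi_decode(TOY_EMISSIONS, TOY_TRANSITIONS, TOY_TAGS))  # ['N', 'V']
```

In this toy case, choosing the best tag for each token independently would give N, N, because the second token's score for N (1.0) is higher than for V (0.9). The transition scores penalize N followed by N and reward N followed by V, so joint decoding returns N, V instead. This is the general motivation for placing a CRF layer on top of per-token scores.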
Pages: 15