Part-Of-Speech Tagger in Malayalam Using Bi-directional LSTM

被引:0
|
作者
Rajan, Rajeev [1 ]
Joseph, Anna J. [1 ]
Robin, Elizabeth K. [1 ]
Nishma, Fathima T. K. [1 ]
机构
[1] Coll Engn, Trivandrum, Kerala, India
来源
PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020) | 2020年
关键词
POS tagging; Malayalam; NLP; Decision tree; BLSTM; Stochastic process;
D O I
10.1109/o-cocosda50338.2020.9295018
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
The majority of activities performed by humans are done through language, whether communicated directly or reported using natural language. As technology is increasingly making the methods and platforms on which we communicate ever more accessible, there is a great need to understand the languages we use to communicate. By combining the power of artificial intelligence, computational linguistics and computer science, natural language processing (NLP) helps machines read text by simulating the human ability to understand language. Part-of-speech tagging (POS Tagging) is done as a pre-requisite to simplify a lot of different NLP applications like question answering, speech recognition, machine translation, and so on. Here, we attempt a comparison between part-of-speech taggers in Malayalam using decision tree algorithm and bi-directional long short term memory (BLSTM). The experiments presented in this paper use two corpora, one of 29076 sentences and the other of 500 sentences for performance evaluation. The experiments demonstrate the potential of architectural choice of BLSTM-based tagger over conventional decision tree-based tagging in Malayalam.
引用
收藏
页码:22 / 27
页数:6
相关论文
共 50 条
  • [1] Bi-directional lstm network speech-to-gesture generation using bi-directional lstm network
    Kaneko N.
    Takeuchi K.
    Hasegawa D.
    Shirakawa S.
    Sakuta H.
    Sumi K.
    Transactions of the Japanese Society for Artificial Intelligence, 2019, 34 (06):
  • [2] Hybrid Part of Speech Tagger for Malayalam
    Francis, Merin
    Nair, K. N. Ramachandran
    2014 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2014, : 1744 - 1750
  • [3] Part-of-speech Tagger for Assamese Using Ensembling Approach
    Pathak, Dhrubajyoti
    Nandi, Sukumar
    Sarmah, Priyankoo
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (10)
  • [4] Implementing an efficient part-of-speech tagger
    Carlberger, J
    Kann, V
    SOFTWARE-PRACTICE & EXPERIENCE, 1999, 29 (09): : 815 - 832
  • [5] An Accurate Persian Part-of-Speech Tagger
    Okhovvat, Morteza
    Sharifi, Mohsen
    Bidgoli, Behrouz Minaei
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2020, 35 (06): : 423 - 430
  • [6] A Practical Part-of-Speech Tagger for Bengali
    Sarkar, Kamal
    Gayen, Vivekananda
    2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012, : 36 - 40
  • [7] An Efficient Part-of-Speech Tagger for Arabic
    Kopru, Selcuk
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PT I, 2011, 6608 : 202 - 213
  • [8] TnT - A statistical part-of-speech tagger
    Brants, T
    6TH APPLIED NATURAL LANGUAGE PROCESSING CONFERENCE/1ST MEETING OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, PROCEEDINGS OF THE CONFERENCE AND PROCEEDINGS OF THE ANLP-NAACL 2000 STUDENT RESEARCH WORKSHOP, 2000, : 224 - 231
  • [9] Tamil Part-of-Speech tagger based on SVMTool
    Dhanalakshmi, V
    Anandkumar, M.
    Vijaya, M. S.
    Loganathan, R.
    Soman, K. P.
    Rajendran, S.
    RECENT ADVANCES OF ASIAN LANGUAGE PROCESSING TECHNOLOGIES, 2008, : 59 - +
  • [10] Toward an Effective Igbo Part-of-Speech Tagger
    Onyenwe, Ikechukwu E.
    Hepple, Mark
    Chinedu, Uchechukwu
    Ezeani, Ignatius
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2019, 18 (04)