Hybrid Part of Speech Tagger for Malayalam

Cited by: 0
Authors
Francis, Merin
Nair, K. N. Ramachandran
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202 ;
Abstract
Part-of-speech (POS) tagging is the process of assigning a part of speech to every word in a given sentence according to its context. POS tagging plays an important role in natural language processing (NLP), with applications including speech recognition, speech synthesis, natural language parsing, information retrieval, multi-word term extraction, word sense disambiguation, and machine translation. This paper proposes an efficient and accurate POS tagging technique for the Malayalam language using a hybrid approach: a Conditional Random Fields (CRF) based method integrated with a rule-based method. We use an SVM-based method to compare accuracy. The corpus used for training and testing the system, both tagged and untagged, is in Unicode format. The tagset developed by IIIT Hyderabad for Indian languages is used. The system was tested on selected books of the Bible and achieved an accuracy of 94%.
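The abstract does not give the features, rules, or training setup, so the following is only a minimal sketch of how a statistical tagger and a rule-based pass could be combined. Every name here (`token_features`, `apply_rules`, `hybrid_tag`, `accuracy`), the feature set, and the two rules are assumptions made for illustration. The CRF itself is represented by an arbitrary callable rather than a trained model. `QC` (cardinal) and `SYM` (symbol) are tags from the IIIT Hyderabad tagset.

```python
from typing import Callable, Dict, List


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for token i (an illustrative, assumed feature set)."""
    word = tokens[i]
    return {
        "word": word,
        # Suffixes are a common cue in morphologically rich languages.
        "suffix2": word[-2:],
        "suffix3": word[-3:],
        "is_digit": word.isdigit(),
        "prev_word": tokens[i - 1] if i > 0 else "<BOS>",
        "next_word": tokens[i + 1] if i < len(tokens) - 1 else "<EOS>",
    }


def apply_rules(tokens: List[str], tags: List[str]) -> List[str]:
    """Rule-based pass: override a statistical tag where a hand-written rule fires."""
    corrected = list(tags)
    for i, word in enumerate(tokens):
        if word.isdigit():
            corrected[i] = "QC"   # hypothetical rule: digit strings are cardinals
        elif word and all(not ch.isalnum() for ch in word):
            corrected[i] = "SYM"  # hypothetical rule: pure punctuation is a symbol
    return corrected


def hybrid_tag(
    tokens: List[str],
    statistical_tagger: Callable[[List[Dict[str, object]]], List[str]],
) -> List[str]:
    """Tag with the statistical model first, then apply the rule-based corrections."""
    features = [token_features(tokens, i) for i in range(len(tokens))]
    return apply_rules(tokens, statistical_tagger(features))


def accuracy(predicted: List[str], gold: List[str]) -> float:
    """Fraction of tokens whose predicted tag matches the gold tag."""
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Stand-in for a trained CRF: tags every token as a noun.
def dummy_tagger(features: List[Dict[str, object]]) -> List[str]:
    return ["NN"] * len(features)


tokens = ["w1", "12", "w2", "."]  # placeholder tokens, not real corpus data
predicted = hybrid_tag(tokens, dummy_tagger)
```

In a real system the `dummy_tagger` placeholder would be replaced by a trained sequence model that consumes the feature dicts, and the rule set would encode language-specific knowledge rather than the two generic rules shown here.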
Pages: 1744-1750
Number of pages: 7
Related papers
50 in total
  • [1] A Hybrid Parts Of Speech Tagger for Malayalam Language
    Aziz, Anisha T.
    Sunitha, C.
    [J]. 2015 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2015, : 1502 - 1507
  • [2] Part-Of-Speech Tagger in Malayalam Using Bi-directional LSTM
    Rajan, Rajeev
    Joseph, Anna J.
    Robin, Elizabeth K.
    Nishma, Fathima T. K.
    [J]. PROCEEDINGS OF 2020 23RD CONFERENCE OF THE ORIENTAL COCOSDA INTERNATIONAL COMMITTEE FOR THE CO-ORDINATION AND STANDARDISATION OF SPEECH DATABASES AND ASSESSMENT TECHNIQUES (ORIENTAL-COCOSDA 2020), 2020, : 22 - 27
  • [3] Hybrid Part of Speech Tagger for Sinhala Language
    Gunasekara, Dilmi
    Welgama, W. V.
    Weerasinghe, A. R.
    [J]. 2016 SIXTEENTH INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER) - 2016, 2016, : 41 - 48
  • [4] On Part of Speech Tagger for Indonesian Language
    Yuwana, R. Sandra
    Yuliani, Asri R.
    Pardede, Hilman F.
    [J]. 2017 2ND INTERNATIONAL CONFERENCES ON INFORMATION TECHNOLOGY, INFORMATION SYSTEMS AND ELECTRICAL ENGINEERING (ICITISEE): OPPORTUNITIES AND CHALLENGES ON BIG DATA FUTURE INNOVATION, 2017, : 369 - 372
  • [5] Implementing an efficient part-of-speech tagger
    Carlberger, J
    Kann, V
    [J]. SOFTWARE-PRACTICE & EXPERIENCE, 1999, 29 (09): : 815 - 832
  • [6] Building a Part of Speech tagger for the Tamil Language
    Sarveswaran, Kengatharaiyer
    Dias, Gihan
    [J]. 2021 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP), 2021, : 286 - 291
  • [7] An Accurate Persian Part-of-Speech Tagger
    Okhovvat, Morteza
    Sharifi, Mohsen
    Bidgoli, Behrouz Minaei
    [J]. COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 2020, 35 (06): : 423 - 430
  • [8] A Practical Part-of-Speech Tagger for Bengali
    Sarkar, Kamal
    Gayen, Vivekananda
    [J]. 2012 THIRD INTERNATIONAL CONFERENCE ON EMERGING APPLICATIONS OF INFORMATION TECHNOLOGY (EAIT), 2012, : 36 - 40
  • [9] Learning a Stochastic Part of Speech Tagger for Sinhala
    Jayasuriya, M.
    Weerasinghe, A. R.
    [J]. 2013 INTERNATIONAL CONFERENCE ON ADVANCES IN ICT FOR EMERGING REGIONS (ICTER), 2013, : 137 - 143
  • [10] An Efficient Part-of-Speech Tagger for Arabic
    Kopru, Selcuk
    [J]. COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, PT I, 2011, 6608 : 202 - 213