Implementation of Automated Bengali Parts of Speech Tagger: An Approach Using Deep Learning Algorithm

Cited by: 0
Authors
Patoary, Asraf Hossain [1]
Bin Kibria, Md Jahid [2]
Kaium, Abdul [1]
Affiliations
[1] Dhaka Int Univ, Dept CSE, Dhaka, Bangladesh
[2] Univ Dev Alternat, Dept CSE, Dhaka, Bangladesh
Keywords
Bengali Parts-of-Speech (POS) Tagger; Deep Learning; Natural Language Processing (NLP); Python package; BNLTK
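DOI
Not available
Chinese Library Classification (CLC) number
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Subject classification code
0808; 0809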
Abstract
Parts-of-speech (POS) tagging assigns each word in a sentence its part of speech. It is an important first step in Natural Language Processing (NLP) applications. For some languages POS tagging achieves high accuracy, but for Bengali it remains an unsolved problem. Bengali is highly ambiguous and inflectional: each word has many variants depending on its suffixes and prefixes. Although POS tagging for Bengali is not new, we aim to build a highly accurate model from a minimal dataset. We developed a deep learning model based mainly on suffixes, which are part of Bengali grammar. We experimented with a Bengali corpus of 2927 words with their corresponding part-of-speech tags. The proposed model achieves an accuracy of 93.90%. We also included the model as a Python package in our open-source Bengali Natural Language Processing Toolkit (BNLTK), which is now live on pypi.org.
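The abstract does not describe how suffixes are encoded as model input. As a minimal sketch of one possible encoding, the Python below one-hot encodes the longest matching suffix of a word. The suffix list, the function names, and the encoding itself are illustrative assumptions; they are not taken from the paper or from the BNLTK package.

```python
# Hypothetical sample of common Bengali suffixes; NOT the paper's feature set.
SAMPLE_SUFFIXES = ["গুলো", "দের", "রা", "কে", "টি"]


def longest_suffix(word, suffixes=SAMPLE_SUFFIXES):
    """Return the longest listed suffix that ends `word`, or "" if none match.

    A suffix only counts if something is left over as a stem.
    """
    matches = [s for s in suffixes if word.endswith(s) and len(word) > len(s)]
    return max(matches, key=len, default="")


def suffix_features(word, suffixes=SAMPLE_SUFFIXES):
    """One-hot vector over the suffix list, plus a final 'no suffix' slot."""
    found = longest_suffix(word, suffixes)
    vector = [1 if s == found else 0 for s in suffixes]
    vector.append(0 if found else 1)  # last slot is set when nothing matched
    return vector
```

A vector like this could be fed to a small classifier together with other word-level features. Which features and which architecture the authors actually used is not stated in the abstract.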
Pages: 308-311
Page count: 4