Supervised Named Entity Recognition in Assamese language

Cited by: 0
Authors
Talukdar, Gitimoni [1]
Borah, Pranjal Protim [1]
Baruah, Arup [1]
Affiliation
[1] Assam Don Bosco Univ, Dept Comp Sci & Engn & IT, Gauhati, India
Keywords
Named Entity Recognition; Corpus; Naive Bayes Classifier; Morphology; Suffix stripping
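DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract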
Nouns play an important role in every natural language. Proper nouns, a subcategory of nouns, represent the names of persons, locations, organizations, etc. The task of recognizing the proper nouns in a text and categorizing them into classes such as person, location, organization and other is called Named Entity Recognition (NER). It is an essential step in many natural language processing applications and makes information extraction easier. NER in most Indian languages has been performed using rule-based, supervised and unsupervised approaches. Our target language in this work is Assamese, which is spoken by most people in the North-Eastern part of India, particularly in Assam. In Assamese, NER has so far been performed using rule-based and suffix-stripping-based approaches. Supervised learning techniques are more useful than rule-based approaches and can be adapted to new domains more easily. This paper reports the first work on Assamese NER using a machine learning technique: NER is performed using a Naive Bayes classifier. Because feature extraction plays the most important role in the performance of any machine learning technique, our aim is to describe a few important features for Assamese NER and to report the performance of the system using these features.
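As an illustration of the general idea in the abstract, the following is a minimal sketch of a token-level Naive Bayes tagger with Laplace smoothing. It is not the authors' system: the abstract does not list the features used, so the three features here (the word itself, its last two characters as a crude stand-in for a suffix cue, and the previous word) are assumptions, as are the names `token_features` and `NaiveBayesTagger`. The training sentences are made-up romanized toy tokens, not data from any Assamese corpus.

```python
import math
from collections import Counter, defaultdict


def token_features(tokens, i):
    """Hypothetical feature set for token i (an assumption, not from the paper)."""
    word = tokens[i]
    prev = tokens[i - 1] if i > 0 else "<s>"
    return ["word=" + word, "suffix2=" + word[-2:], "prev=" + prev]


class NaiveBayesTagger:
    """Multinomial Naive Bayes over string features, with add-alpha smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.label_counts = Counter()            # how often each label occurs
        self.feat_counts = defaultdict(Counter)  # label -> feature -> count
        self.vocab = set()                       # all features seen in training

    def fit(self, sentences):
        for tokens, labels in sentences:
            for i, label in enumerate(labels):
                self.label_counts[label] += 1
                for feat in token_features(tokens, i):
                    self.feat_counts[label][feat] += 1
                    self.vocab.add(feat)
        return self

    def predict_token(self, tokens, i):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # log P(label) + sum of log P(feature | label), smoothed
            score = math.log(count / total)
            denom = sum(self.feat_counts[label].values()) + self.alpha * len(self.vocab)
            for feat in token_features(tokens, i):
                score += math.log((self.feat_counts[label][feat] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, tokens):
        return [self.predict_token(tokens, i) for i in range(len(tokens))]


# Toy, invented training data: (tokens, labels) pairs with PER / LOC / O tags.
train = [
    (["Ram", "Dispurot", "thake"], ["PER", "LOC", "O"]),
    (["Gita", "Jorhatot", "thake"], ["PER", "LOC", "O"]),
    (["Ram", "Jorhatot", "jay"], ["PER", "LOC", "O"]),
]

tagger = NaiveBayesTagger().fit(train)

# "Tezpurot" never appears in training; the suffix and previous-word
# features still favour the LOC label for it.
print(tagger.predict(["Gita", "Tezpurot", "jay"]))  # ['PER', 'LOC', 'O']
```

In this toy setup the unseen second token is tagged from its ending and its left context alone, which is the kind of generalization that hand-chosen features are meant to give a supervised tagger. A real system would need a labeled corpus and a feature set chosen for the language.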
Pages: 187-191
Number of pages: 5