Named Entity Recognition Using Conditional Random Fields

Cited by: 6
Authors
Khan, Wahab [1 ,2 ]
Daud, Ali [3 ]
Shahzad, Khurram [4 ]
Amjad, Tehmina [2 ]
Banjar, Ameen [3 ]
Fasihuddin, Heba [3 ]
Affiliations
[1] Univ Sci & Technol, Dept Comp Sci, Bannu 28100, Pakistan
[2] Int Islamic Univ Islamabad, Dept Comp Sci & Software Engn, Islamabad 44000, Pakistan
[3] Univ Jeddah, Coll Comp Sci & Engn, Dept Informat Syst & Technol, Jeddah 21959, Saudi Arabia
[4] Univ Punjab, Dept Data Sci, Lahore 54000, Pakistan
Source
APPLIED SCIENCES-BASEL | 2022 / Vol. 12 / Issue 13
Keywords
natural language processing; information filtering; information extraction; machine learning; classification algorithms; named entity recognition; NETWORKS;
DOI
10.3390/app12136391
Chinese Library Classification (CLC) number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
Named entity recognition (NER) is an important task in natural language processing and a key information extraction sub-task with numerous application areas. Many NER approaches have been proposed for Western and Asian languages. However, little effort has been made to develop techniques for Urdu, a prominent South Asian language with hundreds of millions of speakers across the globe. NER in Urdu is considered a hard problem for several reasons, including the paucity of large annotated datasets, an inaccurate tokenizer, and the absence of capitalization in Urdu. To this end, this study proposes a conditional random field (CRF)-based technique that uses both language-dependent features, such as part-of-speech tags, and language-independent features, such as context windows of words. As a second contribution, we developed an Urdu NER dataset (UNER-I) in which a large number of NE types were manually annotated. To evaluate the effectiveness of the proposed approach and the usefulness of the dataset, experiments were performed on the dataset we developed and on an existing dataset. The results showed that the proposed technique outperformed the baseline technique on both datasets, improving F1 scores by 1.5% to 3%. Furthermore, the results demonstrated that the enhanced dataset was useful for learning and prediction in a supervised learning approach.
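The two feature families named in the abstract (part-of-speech tags and context windows of words) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `token_features` and `sentence_features`, the feature keys, the window size, the POS tags, and the romanized example sentence are all assumptions made for illustration. It only shows how per-token feature dictionaries of the kind commonly fed to CRF toolkits might be built.

```python
from typing import Dict, List, Tuple

# A sentence is a list of (word, pos_tag) pairs. The tags used below are
# illustrative assumptions, not the tag set used in the paper.
Sentence = List[Tuple[str, str]]


def token_features(sent: Sentence, i: int, window: int = 2) -> Dict[str, str]:
    """Build a feature dict for token i (hypothetical helper)."""
    word, pos = sent[i]
    # Features of the current token: its surface form and its POS tag.
    feats = {"bias": "1", "word": word, "pos": pos}
    # Context window: words and POS tags of neighbours within `window`.
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        if 0 <= j < len(sent):
            feats[f"word[{offset:+d}]"] = sent[j][0]
            feats[f"pos[{offset:+d}]"] = sent[j][1]
        else:
            # Mark positions that fall outside the sentence boundaries.
            feats[f"pad[{offset:+d}]"] = "BOS" if j < 0 else "EOS"
    return feats


def sentence_features(sent: Sentence, window: int = 2) -> List[Dict[str, str]]:
    """Feature dicts for every token in the sentence (hypothetical helper)."""
    return [token_features(sent, i, window) for i in range(len(sent))]


# Hypothetical romanized Urdu sentence with made-up POS tags.
example = [("Ali", "NNP"), ("Lahore", "NNP"), ("mein", "PSP"),
           ("rehta", "VB"), ("hai", "AUX")]
features = sentence_features(example, window=1)
```

Each token yields one dictionary, so a sentence becomes a sequence of dictionaries aligned with its label sequence. Because Urdu has no capitalization, the sketch deliberately omits the case-based features that are common in English NER.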
Pages: 18