Arabic Named Entity Recognition: A Feature-Driven Study

Cited by: 38
Authors
Benajiba, Yassine [1, 2]
Diab, Mona [3]
Rosso, Paolo [1, 2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, Valencia 46022, Spain
[2] Univ Politecn Valencia, Nat Language Engn Lab, Valencia 46022, Spain
[3] Columbia Univ, Ctr Computat Learning Syst, New York, NY 10115 USA
Keywords
Arabic; machine learning comparison; named entity recognition; natural language processing (NLP); maximum entropy
DOI
10.1109/TASL.2009.2019927
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The Named Entity Recognition task aims at identifying and classifying Named Entities within an open-domain text. This task has been garnering significant attention recently as it has been shown to help improve the performance of many Natural Language Processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely, support vector machines, maximum entropy and conditional random fields for the task of Named Entity Recognition. Our language of interest is Arabic. We explore lexical, contextual and morphological features and nine data-sets of different genres and annotations. We measure the impact of the different features in isolation and incrementally combine them in order to evaluate the robustness to noise of each approach. We achieve the highest performance using a combination of 15 features in conditional random fields using Broadcast News data (F(beta=1) = 83.34).
Pages: 926-934
Page count: 9