Arabic Named Entity Recognition: A Feature-Driven Study

Cited by: 38
Authors
Benajiba, Yassine [1, 2]
Diab, Mona [3]
Rosso, Paolo [1, 2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, Valencia 46022, Spain
[2] Univ Politecn Valencia, Nat Language Engn Lab, Valencia 46022, Spain
[3] Columbia Univ, Ctr Computat Learning Syst, New York, NY 10115 USA
Keywords
Arabic; machine learning comparison; named entity recognition; natural language processing (NLP); MAXIMUM-ENTROPY;
DOI
10.1109/TASL.2009.2019927
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
The Named Entity Recognition task aims at identifying and classifying Named Entities within an open-domain text. This task has been garnering significant attention recently as it has been shown to help improve the performance of many Natural Language Processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely, support vector machines, maximum entropy and conditional random fields for the task of Named Entity Recognition. Our language of interest is Arabic. We explore lexical, contextual and morphological features and nine data sets of different genres and annotations. We measure the impact of the different features in isolation and incrementally combine them in order to evaluate the robustness to noise of each approach. We achieve the highest performance using a combination of 15 features in conditional random fields using Broadcast News data (F_{β=1} = 83.34).
Pages: 926-934
Page count: 9
Related papers
50 in total
  • [1] Advanced Feature-Driven Disease Named Entity Recognition Using Conditional Random Fields
    Rahman, Hidayat
    Hahn, Thomas
    Segall, Richard
    PROCEEDINGS OF THE 7TH ACM INTERNATIONAL CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY, AND HEALTH INFORMATICS, 2016, : 469 - 469
  • [2] Hybrid Feature Selection Approach for Arabic Named Entity Recognition
    Shahine, Miran
    Sakre, Mohamed
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, (CICLING 2016), PT I, 2018, 9623 : 452 - 464
  • [3] Arabic Named Entity Recognition
    Benajiba, Yassine
    PROCESAMIENTO DEL LENGUAJE NATURAL, 2010, (44) : 151 - 152
  • [4] A Contribution to Arabic Named Entity Recognition
    Koulali, Rim
    Meziane, Abdelouafi
    2012 TENTH INTERNATIONAL CONFERENCE ON ICT AND KNOWLEDGE ENGINEERING, 2012, : 46 - 52
  • [5] NERA: Named Entity Recognition for Arabic
    Shaalan, Khaled
    Raza, Hafsa
    JOURNAL OF THE AMERICAN SOCIETY FOR INFORMATION SCIENCE AND TECHNOLOGY, 2009, 60 (08) : 1652 - 1663
  • [6] A New Approach for Arabic Named Entity Recognition
    Karaa, Wahiba
    Slimani, Thabet
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2017, 14 (03) : 332 - 338
  • [7] A Survey of Arabic Named Entity Recognition and Classification
    Shaalan, Khaled
    COMPUTATIONAL LINGUISTICS, 2014, 40 (02) : 469 - 510
  • [8] Named entity recognition and classification for text in arabic
    Abuleil, S
    Evens, M
    INTELLIGENT AND ADAPTIVE SYSTEMS AND SOFTWARE ENGINEERING, 2004, : 89 - 94
  • [9] RENA: A Named Entity Recognition System for Arabic
    El Bazi, Ismail
    Laachfoubi, Nabil
    TEXT, SPEECH, AND DIALOGUE (TSD 2015), 2015, 9302 : 396 - 404
  • [10] Arabic named entity recognition in crime documents
    Asharef, M.
    Omar, N.
    Albared, M.
    Journal of Theoretical and Applied Information Technology, 2012, 44 (01) : 1 - 6