Arabic Named Entity Recognition: A Feature-Driven Study

Cited by: 37
Authors
Benajiba, Yassine [1, 2]
Diab, Mona [3]
Rosso, Paolo [1, 2]
Affiliations
[1] Univ Politecn Valencia, Dept Informat Syst & Computat, Valencia 46022, Spain
[2] Univ Politecn Valencia, Nat Language Engn Lab, Valencia 46022, Spain
[3] Columbia Univ, Ctr Computat Learning Syst, New York, NY 10115 USA
Keywords
Arabic; machine learning comparison; named entity recognition; natural language processing (NLP); maximum entropy
DOI
10.1109/TASL.2009.2019927
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
The Named Entity Recognition task aims at identifying and classifying Named Entities within an open-domain text. This task has been garnering significant attention recently as it has been shown to help improve the performance of many Natural Language Processing applications. In this paper, we investigate the impact of using different sets of features in three discriminative machine learning frameworks, namely, support vector machines, maximum entropy and conditional random fields for the task of Named Entity Recognition. Our language of interest is Arabic. We explore lexical, contextual and morphological features and nine data-sets of different genres and annotations. We measure the impact of the different features in isolation and incrementally combine them in order to evaluate the robustness to noise of each approach. We achieve the highest performance using a combination of 15 features in conditional random fields using Broadcast News data (F_{β=1} = 83.34).
Pages: 926-934
Number of pages: 9