Named Entity Recognition of Tunisian Arabic Using the Bi-LSTM-CRF Model

Cited by: 2
Authors
Mekki, Asma [1]
Zribi, Ines [2]
Ellouze, Mariem [1]
Belguith, Lamia Hadrich [1]
Affiliations
[1] Univ Sfax, ANLP Res Grp, MIRACL, Sfax, Tunisia
[2] Univ Monastir, ANLP Res Grp, MIRACL, Monastir, Tunisia
Keywords
Named entity recognition; Arabic dialect; Tunisian Arabic; Bi-LSTM-CRF;
DOI
10.1142/S0218213023500628
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Named Entity Recognition (NER) is an NLP task that deals with recognizing and classifying entities in written text. Most Arabic NER research addresses Modern Standard Arabic (MSA). However, dialectal Arabic text is increasingly present in social media, blogs, TV shows, etc. Therefore, the treatment of named entities is rapidly becoming a necessity, particularly for dialectal Arabic. In this paper, we collect and annotate a corpus and build an NER system for Tunisian Arabic (TA), named TUNER. To the best of our knowledge, this is the first study that applies the proposed method to this purpose. We adopt a hybrid method that combines a Bi-LSTM-CRF model with a rule-based method. The proposed TUNER system yields an F-measure of 91.43%, an improvement over comparable dialectal Arabic NER systems in related work.
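The abstract reports an F-measure but does not state the evaluation protocol. As an illustration only, the following minimal sketch shows how an entity-level F-measure is commonly computed for NER, assuming BIO-style tags and exact-match scoring of entity spans. The tag names (PER, LOC, ORG) and the helper names `bio_to_spans` and `f_measure` are hypothetical and are not taken from the paper.

```python
from typing import List, Set, Tuple

# (start index, end index exclusive, entity type)
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed tagging scheme)."""
    spans: Set[Span] = set()
    start, etype = None, None
    # A trailing "O" sentinel closes an entity that ends at the last token.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, etype))
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]
    return spans


def f_measure(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact span-and-type matching."""
    gold_spans, pred_spans = bio_to_spans(gold), bio_to_spans(pred)
    tp = len(gold_spans & pred_spans)  # spans with identical bounds and type
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: the second entity has the right span but the wrong type.
gold = ["B-PER", "I-PER", "O", "B-LOC", "O"]
pred = ["B-PER", "I-PER", "O", "B-ORG", "O"]
precision, recall, f1 = f_measure(gold, pred)
```

Under exact matching, a prediction with correct boundaries but the wrong type counts as both a false positive and a false negative, so one of the two entities in this example is credited. Other protocols (token-level or partial-match scoring) would give different numbers.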
Pages: 17