Arabic named entity recognition in social media based on BiLSTM-CRF using an attention mechanism

Cited by: 2
Authors
Benali, B. Ait [1]
Mihi, S. [1]
Mlouk, A. Ait [2]
El Bazi, I. [3]
Laachfoubi, N. [1]
Affiliations
[1] Hassan First Univ Settat, Fac Sci & Tech, IR2M Lab, Settat, Morocco
[2] Uppsala Univ, Dept Informat Technol, Div Sci Comp, Uppsala, Sweden
[3] Sultan Moulay Slimane Univ, Natl Sch Business & Management, Beni Mellal, Morocco
Keywords
Arabic named entity recognition (ANER); natural language processing (NLP); multi-head self-attention; BiLSTM; CRF; Dialectal Arabic; social media
DOI
10.3233/JIFS-211944
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Named Entity Recognition (NER) is a core task in Natural Language Processing (NLP) that aims to find named entities in natural language text and classify them into predefined categories such as persons (PER), places (LOC), and organizations (ORG). For Arabic, current deep-learning NER approaches mainly take either word embeddings or character-level embeddings as input. However, a single-granularity representation suffers from out-of-vocabulary (OOV) words, word embedding errors, and relatively simple semantic content. This paper presents a multi-head self-attention mechanism integrated into a BiLSTM-CRF neural network to recognize Arabic named entities on social media using two embeddings. Unlike other state-of-the-art approaches, this approach combines character and word embeddings at the embedding layer, and the attention mechanism calculates the similarity over the entire sequence of characters and captures local context information. The proposed approach better recognizes NEs in Dialectal Arabic, reaching an F1 score of 74.15% on Darwish's dataset (a publicly available Arabic NER benchmark for social media). To our knowledge, these results outperform the current state-of-the-art models for Arabic Named Entity Recognition on social media.
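The two ideas the abstract names, concatenating word and character embeddings per token and applying multi-head self-attention over the sequence, can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the toy vectors, the dimensions, and the function names are hypothetical, the learned query/key/value projections are replaced by the identity for brevity, and the BiLSTM and CRF layers are omitted entirely.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(seq):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would learn these projections.
    """
    d = len(seq[0])
    outputs, weights = [], []
    for q in seq:
        # Similarity of this token to every token in the sequence.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in seq]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of all token vectors.
        outputs.append([sum(w[j] * seq[j][i] for j in range(len(seq)))
                        for i in range(d)])
    return outputs, weights

def multi_head_self_attention(seq, num_heads):
    """Split each vector into num_heads slices, attend per slice, re-join."""
    d = len(seq[0])
    assert d % num_heads == 0, "vector size must be divisible by num_heads"
    d_head = d // num_heads
    head_outputs = []
    for h in range(num_heads):
        chunk = [v[h * d_head:(h + 1) * d_head] for v in seq]
        out, _ = self_attention(chunk)
        head_outputs.append(out)
    # Concatenate the head outputs for each token.
    return [sum((head_outputs[h][t] for h in range(num_heads)), [])
            for t in range(len(seq))]

# Toy, hypothetical embeddings for a 3-token sentence.
word_emb = [[0.1, 0.3, -0.2, 0.5], [0.4, -0.1, 0.0, 0.2], [-0.3, 0.2, 0.1, 0.0]]
char_emb = [[0.05, -0.4], [0.3, 0.1], [-0.2, 0.6]]

# Combine both granularities at the embedding layer by concatenation.
tokens = [w + c for w, c in zip(word_emb, char_emb)]

attended = multi_head_self_attention(tokens, num_heads=2)
```

In a full model along the lines the abstract describes, the attended vectors would then be passed through a BiLSTM and decoded by a CRF layer; how exactly the attention is placed relative to those layers is not stated in this record.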
Pages: 5427-5436
Page count: 10