Investigating annotation noise for named entity recognition

Cited by: 3
Authors
Zhu, Yu [1]
Ye, Yingchun [1]
Li, Mengyang [1, 2]
Zhang, Ji [3]
Wu, Ou [1]
Affiliations
[1] Tianjin Univ, Natl Ctr Appl Math, Weijin Rd, Tianjin 300072, Peoples R China
[2] Jiuantianxia Inc, Jinguan North Second St, Beijing 100102, Peoples R China
[3] Zhejiang Lab, Wenyi West Rd, Hangzhou 311100, Zhejiang, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 01
Keywords
Information extraction; Named entity recognition; Noisy labels; Bayesian neural network
DOI
10.1007/s00521-022-07733-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies have revealed that even the most widely used benchmark dataset for Named Entity Recognition (NER) still contains more than 5% sample-level annotation noise. Hence, we investigate annotation noise in terms of noise detection and noise-robust learning. First, considering that noisy labels usually occur when few or vague annotation cues appear in annotated texts and their contexts, we construct an annotation noise detection model based on a self-context contrastive loss. Second, we present an improved Bayesian neural network (BNN) that adds a learnable systematic deviation term to the label generation process of the classical BNN. In addition, we propose two strategies for learning the systematic deviation term, both based on the output of the noise detection model. Experimental results show that our noise detection model achieves an improvement of up to 7.44% F1 on CoNLL03 over the existing method. Extensive experiments on two widely used but noisy NER benchmarks, CoNLL03 and WNUT17, demonstrate that the proposed systematic deviation BNN has the potential to capture systematic annotation mistakes and that it can be extended to other areas with annotation noise.
Pages: 993 - 1007
Page count: 15