Investigating annotation noise for named entity recognition

Cited by: 3
Authors
Zhu, Yu [1]
Ye, Yingchun [1]
Li, Mengyang [1, 2]
Zhang, Ji [3]
Wu, Ou [1]
Affiliations
[1] Tianjin Univ, Natl Ctr Appl Math, Weijin Rd, Tianjin 300072, Peoples R China
[2] Jiuantianxia Inc, Jinguan North Second St, Beijing 100102, Peoples R China
[3] Zhejiang Lab, Wenyi West Rd, Hangzhou 311100, Zhejiang, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023, Vol. 35, No. 1
Keywords
Information extraction; Named entity recognition; Noisy labels; Bayesian neural network
DOI
10.1007/s00521-022-07733-0
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies have revealed that even the most widely used benchmark dataset for Named Entity Recognition (NER) still contains more than 5% sample-level annotation noise. Hence, we investigate annotation noise in terms of noise detection and noise-robust learning. First, considering that noisy labels usually occur when few or vague annotation cues appear in annotated texts and their contexts, an annotation noise detection model is constructed based on a self-context contrastive loss. Second, an improved Bayesian neural network (BNN) is presented by adding a learnable systematic deviation term to the label generation process of the classical BNN. In addition, two strategies for learning the systematic deviation term, based on the output of the noise detection model, are proposed. Experimental results show that our proposed noise detection model improves F1 by up to 7.44% on CoNLL03 over the existing method. Extensive experiments on two widely used but noisy NER benchmarks, CoNLL03 and WNUT17, demonstrate that our proposed systematic deviation BNN has the potential to capture systematic annotation mistakes and that it can be extended to other areas with annotation noise.
Pages: 993-1007
Number of pages: 15