Investigating annotation noise for named entity recognition

Cited by: 3
Authors
Zhu, Yu [1]
Ye, Yingchun [1]
Li, Mengyang [1, 2]
Zhang, Ji [3]
Wu, Ou [1]
Affiliations
[1] Tianjin Univ, Natl Ctr Appl Math, Weijin Rd, Tianjin 300072, Peoples R China
[2] Jiuantianxia Inc, Jinguan North Second St, Beijing 100102, Peoples R China
[3] Zhejiang Lab, Wenyi West Rd, Hangzhou 311100, Zhejiang, Peoples R China
Source
NEURAL COMPUTING & APPLICATIONS | 2023 / Vol. 35 / Issue 01
Keywords
Information extraction; Named entity recognition; Noisy labels; Bayesian neural network
DOI
10.1007/s00521-022-07733-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent studies have revealed that even the most widely used benchmark dataset for Named Entity Recognition (NER) still contains more than 5% sample-level annotation noise. Hence, we investigate annotation noise in terms of noise detection and noise-robust learning. First, considering that noisy labels usually occur when few or vague annotation cues appear in annotated texts and their contexts, an annotation noise detection model is constructed based on a self-context contrastive loss. Second, an improved Bayesian neural network (BNN) is presented by adding a learnable systematic deviation term to the label generation process of the classical BNN. In addition, two learning strategies for the systematic deviation terms, based on the output of the noise detection model, are proposed. Experimental results show that our proposed noise detection model improves F1 on CoNLL03 by up to 7.44% over the existing method. Extensive experiments on two widely used but noisy NER benchmarks, CoNLL03 and WNUT17, demonstrate that our proposed systematic deviation BNN has the potential to capture systematic annotation mistakes and that it can be extended to other areas with annotation noise.
Pages: 993-1007
Number of pages: 15