Learning to select pseudo labels: a semi-supervised method for named entity recognition

Cited by: 0
Authors
Zhen-zhen Li
Da-wei Feng
Dong-sheng Li
Xi-cheng Lu
Affiliation
[1] National University of Defense Technology, College of Computer
Keywords
Named entity recognition; Unlabeled data; Deep learning; Semi-supervised method
DOI
Not available
CLC number
TP391.1
Subject classification number
Abstract
Deep learning models have achieved state-of-the-art performance in named entity recognition (NER), but this performance relies heavily on substantial amounts of labeled data. In specialized areas such as the medical, financial, and military domains, labeled data is scarce, while unlabeled data is readily available. Previous studies have used unlabeled data to enrich word representations, but they neglect a large amount of entity information in unlabeled data that may benefit the NER task. In this study, we propose a semi-supervised method for NER tasks that learns to create high-quality labeled data by applying a pre-trained module to filter out erroneous pseudo labels. Pseudo labels are automatically generated for unlabeled data and used as if they were true labels. Our semi-supervised framework includes three steps: constructing an optimal single neural model for a specific NER task, learning a module that evaluates pseudo labels, and iteratively creating new labeled data and improving the NER model. Experimental results on two English NER tasks and one Chinese clinical NER task demonstrate that our method further improves the performance of the best single neural model. Even when we use only pre-trained static word embeddings and do not rely on any external knowledge, our method achieves performance comparable to that of state-of-the-art models on the CoNLL-2003 and OntoNotes 5.0 English NER tasks.
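The three-step loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names self_train, train_lexicon_tagger, and capitalization_score, the threshold value, and the toy sentences are all hypothetical. In particular, the paper learns its pseudo-label evaluator, whereas this sketch swaps in a hand-written capitalization heuristic, and the "model" is a simple token lexicon rather than a neural network.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sentence = List[str]
Labels = List[str]
Example = Tuple[Sentence, Labels]
Tagger = Callable[[Sentence], Labels]


def train_lexicon_tagger(data: List[Example]) -> Tagger:
    """Toy stand-in for an NER model: memorize one label per token."""
    lexicon: Dict[str, str] = {}
    for sent, labels in data:
        for tok, lab in zip(sent, labels):
            lexicon[tok] = lab
    return lambda sent: [lexicon.get(tok, "O") for tok in sent]


def capitalization_score(sent: Sentence, labels: Labels) -> float:
    """Hypothetical selector: share of capitalized tokens tagged as entities.

    Assumption: the paper's evaluator is a learned module; this heuristic
    only plays its role in the loop.
    """
    caps = [lab for tok, lab in zip(sent, labels) if tok[:1].isupper()]
    if not caps:
        return 1.0
    return sum(lab != "O" for lab in caps) / len(caps)


def self_train(
    labeled: List[Example],
    unlabeled: Sequence[Sentence],
    train: Callable[[List[Example]], Tagger],
    score: Callable[[Sentence, Labels], float],
    threshold: float = 0.9,
    rounds: int = 3,
) -> Tuple[Tagger, List[Example]]:
    """Grow the training set with pseudo labels accepted by the selector."""
    data = list(labeled)
    pool = list(unlabeled)
    model = train(data)  # step 1: base model trained on labeled data
    for _ in range(rounds):
        kept: List[Example] = []
        rest: List[Sentence] = []
        for sent in pool:
            pseudo = model(sent)  # pseudo labels from the current model
            # step 2: keep only pseudo labels the selector rates highly
            if score(sent, pseudo) >= threshold:
                kept.append((sent, pseudo))
            else:
                rest.append(sent)
        if not kept:
            break  # nothing new was accepted, so stop early
        data.extend(kept)  # step 3: add accepted examples and retrain
        pool = rest
        model = train(data)
    return model, data


labeled = [(["Alice", "visited", "Paris"], ["PER", "O", "LOC"])]
unlabeled = [["Alice", "left", "Paris"], ["Bob", "visited", "Rome"]]
model, data = self_train(labeled, unlabeled, train_lexicon_tagger, capitalization_score)
```

In this toy run the first unlabeled sentence is accepted because both capitalized tokens receive entity labels, while the second is rejected because the lexicon tagger leaves "Bob" and "Rome" untagged. A real system would replace both the tagger and the selector with trained models.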
Pages: 903-916
Number of pages: 13
Related papers
50 in total
  • [1] Learning to select pseudo labels: a semi-supervised method for named entity recognition
    Li, Zhen-zhen
    Feng, Da-wei
    Li, Dong-sheng
    Lu, Xi-cheng
    [J]. FRONTIERS OF INFORMATION TECHNOLOGY & ELECTRONIC ENGINEERING, 2020, 21 (06) : 903 - 916
  • [2] Named entity recognition: a semi-supervised learning approach
    Sintayehu H.
    Lehal G.S.
    [J]. International Journal of Information Technology, 2021, 13 (4) : 1659 - 1665
  • [3] A Semi-Supervised Algorithm for Indonesian Named Entity Recognition
    Leonandya, Rezka Aufar
    Distiawan, Bayu
    Praptono, Nursidik Heru
    [J]. 2015 3RD INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL AND BUSINESS INTELLIGENCE (ISCBI 2015), 2015, : 45 - 50
  • [4] Uncertainty-Aware Contrastive Learning for semi-supervised named entity recognition
    Yang, Kang
    Yang, Zhiwei
    Zhao, Songwei
    Yang, Zhejian
    Zhang, Sinuo
    Chen, Hechang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2024, 296
  • [5] Semi-Supervised Noisy Label Learning for Chinese Clinical Named Entity Recognition
    Li, Zhucong
    Gan, Zhen
    Zhang, Baoli
    Chen, Yubo
    Wan, Jing
    Liu, Kang
    Zhao, Jun
    Liu, Shengping
    [J]. DATA INTELLIGENCE, 2021, 3 (03) : 389 - 401
  • [6] Semi-Supervised Noisy Label Learning for Chinese Clinical Named Entity Recognition
    Zhucong Li
    Zhen Gan
    Baoli Zhang
    Yubo Chen
    Jing Wan
    Kang Liu
    Jun Zhao
    Shengping Liu
    [J]. Data Intelligence, 2021, 3 (03) : 389 - 401
  • [7] Semi-supervised disentangled framework for transferable named entity recognition
    Hao, Zhifeng
    Lv, Di
    Li, Zijian
    Cai, Ruichu
    Wen, Wen
    Xu, Boyan
    [J]. NEURAL NETWORKS, 2021, 135 : 127 - 138
  • [8] Semi-Supervised Learning for Named Entity Recognition Using Weakly Labeled Training Data
    Zafarian, Atefeh
    Rokni, Ali
    Khadivi, Shahram
    Ghiasifard, Sonia
    [J]. 2015 INTERNATIONAL SYMPOSIUM ON ARTIFICIAL INTELLIGENCE AND SIGNAL PROCESSING (AISP), 2015, : 129 - 135
  • [9] A Hybrid Approach of Pattern Extraction and Semi-supervised Learning for Vietnamese Named Entity Recognition
    Vo, Duc-Thuan
    Ock, Cheol-Young
    [J]. COMPUTATIONAL COLLECTIVE INTELLIGENCE - TECHNOLOGIES AND APPLICATIONS, PT I, 2012, 7653 : 83 - 93
  • [10] Named Entity Recognition on Twitter for Turkish using Semi-supervised Learning with Word Embeddings
    Okur, Eda
    Demir, Hakan
    Ozgur, Arzucan
    [J]. LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 549 - 555