3Rs: Data Augmentation Techniques Using Document Contexts For Low-Resource Chinese Named Entity Recognition

Cited by: 0
Authors
Ying, Zheyu [1, 2]
Zhang, Jinglei [1, 2]
Xie, Rui [1]
Wen, Guochang [1, 2]
Xiao, Feng [1, 2]
Liu, Xueyang [1]
Zhang, Shikun [1]
Affiliations
[1] Peking Univ, Natl Engn Res Ctr Software Engn, Beijing, Peoples R China
[2] Peking Univ, Sch Software & Microelect, Beijing, Peoples R China
Keywords
Chinese NER; Data Augmentation; Document-Level; Adversarial Attack; Low-Resource
DOI
10.1109/IJCNN55064.2022.9892341
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
With recent advances in neural networks and pre-training techniques, Chinese Named Entity Recognition (NER) has made great progress. However, NER systems still generalize poorly when annotated data is scarce, and current NER models mostly consider input sentences individually, which prevents them from exploiting cross-sentence document context during training. To address these problems, this paper presents new insights into Chinese NER and proposes 3Rs: three data augmentation methods that incorporate document-level information for NER through random concatenating, random swapping, and random erasing. Inspired by multi-sample data augmentation techniques in computer vision, these methods aim to reorganize the composition of training sentences and generate more training examples with less human effort. We conduct extensive experiments on two Chinese datasets and introduce a two-level attacking method to audit robustness. Our experimental results show that even the best model can obtain better accuracy and robustness, especially with smaller training sets, thereby alleviating performance bottlenecks in low-resource conditions.
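The abstract names the three operations but does not specify how they are carried out, so the following is only a rough sketch of one possible reading, not the authors' implementation. It assumes a document is a list of sentences, each a list of (character, BIO tag) pairs, and applies every operation to whole sentences so that labels stay aligned with their characters. The function names `random_concat`, `random_swap`, and `random_erase`, the sentence-level granularity, and the toy document are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

Token = Tuple[str, str]        # (character, BIO tag); this layout is an assumption
Sentence = List[Token]
Document = List[Sentence]


def _copy(doc: Document) -> Document:
    """Return a copy of the document so the input is never mutated."""
    return [list(sentence) for sentence in doc]


def random_concat(doc: Document, rng: random.Random) -> Document:
    """Merge one randomly chosen sentence with its right-hand neighbour."""
    out = _copy(doc)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    out[i:i + 2] = [out[i] + out[i + 1]]
    return out


def random_swap(doc: Document, rng: random.Random) -> Document:
    """Exchange the positions of two randomly chosen sentences."""
    out = _copy(doc)
    if len(out) >= 2:
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_erase(doc: Document, rng: random.Random) -> Document:
    """Drop one randomly chosen sentence, keeping at least one sentence."""
    out = _copy(doc)
    if len(out) >= 2:
        del out[rng.randrange(len(out))]
    return out


# Toy document (hypothetical data, not taken from the paper's datasets).
doc: Document = [
    [("北", "B-LOC"), ("京", "I-LOC"), ("很", "O"), ("大", "O")],
    [("我", "O"), ("在", "O"), ("北", "B-ORG"), ("大", "I-ORG")],
    [("今", "O"), ("天", "O"), ("晴", "O")],
]

rng = random.Random(0)         # fixed seed so runs are repeatable
concatenated = random_concat(doc, rng)
swapped = random_swap(doc, rng)
erased = random_erase(doc, rng)
```

Working on whole sentences is a deliberate simplification: it avoids splitting an entity span, at the cost of coarser augmentation than a token-level variant would give.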
Pages: 8
Related papers
50 in total
  • [11] Dual Adversarial Neural Transfer for Low-Resource Named Entity Recognition
    Zhou, Joey Tianyi
    Zhang, Hao
    Jin, Di
    Zhu, Hongyuan
    Fang, Meng
    Goh, Rick Siow Mong
    Kwok, Kenneth
    57TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2019), 2019, : 3461 - 3471
  • [12] Knowledge-Enriched Prompt for Low-Resource Named Entity Recognition
    Hou, Wenlong
    Zhao, Weidong
    Liu, Xianhui
    Guo, Wenyan
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2024, 23 (05)
  • [13] LELNER: A Lightweight and Effective Low-resource Named Entity Recognition model
    Zhang, Zhanjun
    Zhang, Haoyu
    Wan, Qian
    Liu, Jie
    KNOWLEDGE-BASED SYSTEMS, 2022, 251
  • [14] A Word Representation to Improve Named Entity Recognition in Low-resource Languages
    Mbouopda, Michael Franklin
    Yonta, Paulin Melatagia
    2019 SIXTH INTERNATIONAL CONFERENCE ON SOCIAL NETWORKS ANALYSIS, MANAGEMENT AND SECURITY (SNAMS), 2019, : 333 - 337
  • [15] Language inference-based learning for Low-Resource Chinese clinical named entity recognition using language model
    Cui, Zhaojian
    Yu, Kai
    Yuan, Zhenming
    Dong, Xiaofeng
    Luo, Weibin
    JOURNAL OF BIOMEDICAL INFORMATICS, 2024, 149
  • [16] Improving Low-Resource Chinese Named Entity Recognition Using Bidirectional Encoder Representation from Transformers and Lexicon Adapter
    Dang, Xiaochao
    Wang, Li
    Dong, Xiaohui
    Li, Fenfang
    Deng, Han
    APPLIED SCIENCES-BASEL, 2023, 13 (19):
  • [17] Integrating prompt techniques and multi-similarity matching for named entity recognition in low-resource settings
    Yang, Jun
    Yao, Liguo
    Zhang, Taihua
    Tsai, Chieh-Yuan
    Lu, Yao
    Shen, Mingming
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2025, 144
  • [18] Label-Guided Data Augmentation for Chinese Named Entity Recognition
    Jiang, Miao
    Chen, Honghui
    APPLIED SCIENCES-BASEL, 2025, 15 (05):
  • [19] Semi-supervised Named Entity Recognition for Low-Resource Languages Using Dual PLMs
    Yohannes, Hailemariam Mehari
    Lynden, Steven
    Amagasa, Toshiyuki
    Matono, Akiyoshi
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, PT I, NLDB 2024, 2024, 14762 : 166 - 180
  • [20] A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition
    Yu, Houjin
    Mao, Xian-Ling
    Chi, Zewen
    Wei, Wei
    Huang, Heyan
    11TH IEEE INTERNATIONAL CONFERENCE ON KNOWLEDGE GRAPH (ICKG 2020), 2020, : 297 - 304