Exogenous and Endogenous Data Augmentation for Low-Resource Complex Named Entity Recognition

Cited by: 0
Authors
Zhang, Xinghua [1, 2]
Chen, Gaode [1, 2]
Cui, Shiyao [1, 2]
Sheng, Jiawei [1, 2]
Liu, Tingwen [1, 2]
Xu, Hongbo [1, 2]
Affiliations
[1] Univ Chinese Acad Sci, Sch Cyber Secur, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Informat Engn, Beijing, Peoples R China
Keywords
Knowledge Acquisition; Data Augmentation; Named Entity Recognition; Low-Resource Learning
DOI
10.1145/3626772.3657754
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Low-resource Complex Named Entity Recognition aims to detect entities with the form of any linguistic constituent under scenarios with limited manually annotated data. Existing studies augment the text through the substitution of same type entities or language modeling, but suffer from the lower quality and the limited entity context patterns within low-resource corpora. In this paper, we propose a novel data augmentation method E²DA from both exogenous and endogenous perspectives. As for exogenous augmentation, we treat the limited manually annotated data as anchors, and leverage the powerful instruction-following capabilities of Large Language Models (LLMs) to expand the anchors by generating data that are highly dissimilar from the original anchor texts in terms of entity mentions and contexts. As regards the endogenous augmentation, we explore diverse semantic directions in the implicit feature space of the original and expanded anchors for effective data augmentation. Our complementary augmentation method from two perspectives not only continuously expands the global text-level space, but also fully explores the local semantic space for more diverse data augmentation. Extensive experiments on 10 diverse datasets across various low-resource settings demonstrate that the proposed method excels significantly over prior state-of-the-art data augmentation methods.
Pages: 630-640
Page count: 11