Counterfactual Generator: A Weakly-Supervised Method for Named Entity Recognition

Authors
Zeng, Xiangji [1]
Li, Yunliang [1]
Zhai, Yuchen [1]
Zhang, Yin [1]
Affiliation
[1] Zhejiang University, Hangzhou, People's Republic of China
Funding
National Key Research and Development Program of China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Past progress on neural models has shown that named entity recognition is no longer a problem when enough labeled data is available. However, collecting and annotating enough data is labor-intensive, time-consuming, and expensive. In this paper, we decompose a sentence into two parts, entity and context, and rethink the relationship between them and model performance from a causal perspective. Based on this, we propose the Counterfactual Generator, which generates counterfactual examples by intervening on existing observational examples to augment the original dataset. Experiments across three datasets show that our method improves the generalization ability of models when observational examples are limited. In addition, we provide a theoretical foundation by using a structural causal model to explore the spurious correlations between input features and output labels. We investigate the causal effects of entity and context on model performance under two conditions: non-augmented and augmented. Interestingly, we find that the non-spurious correlations are located more in the entity representation than in the context representation. As a result, our method eliminates part of the spurious correlations between context representation and output labels. The code is available at https://github.com/xijiz/cfgen.
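The abstract does not spell out the generation procedure. One plausible reading of an intervention on the entity is to swap each entity for a different entity of the same type drawn from the training set while holding the context fixed. The sketch below illustrates that reading only, assuming BIO-tagged input. The function names (`spans`, `collect_entities`, `counterfactual`) are hypothetical and are not taken from the cfgen repository.

```python
import random
from collections import defaultdict


def spans(tags):
    """Yield (start, end, type) for each entity in a BIO tag sequence."""
    start, etype = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        if tag.startswith("I-") and start is not None and tag[2:] == etype:
            continue  # still inside the current entity
        if start is not None:
            yield start, i, etype
            start, etype = None, None
        if tag.startswith("B-"):
            start, etype = i, tag[2:]


def collect_entities(dataset):
    """Group the entity mentions observed in the dataset by entity type."""
    pool = defaultdict(set)
    for tokens, tags in dataset:
        for start, end, etype in spans(tags):
            pool[etype].add(tuple(tokens[start:end]))
    return pool


def counterfactual(tokens, tags, pool, rng):
    """Assumed intervention: replace each entity with a different
    same-type entity from the pool and leave the context untouched."""
    new_tokens, new_tags, cursor = [], [], 0
    for start, end, etype in spans(tags):
        # Copy the context that precedes this entity unchanged.
        new_tokens += tokens[cursor:start]
        new_tags += tags[cursor:start]
        original = tuple(tokens[start:end])
        candidates = sorted(pool[etype] - {original})
        repl = rng.choice(candidates) if candidates else original
        new_tokens += list(repl)
        new_tags += ["B-" + etype] + ["I-" + etype] * (len(repl) - 1)
        cursor = end
    new_tokens += tokens[cursor:]
    new_tags += tags[cursor:]
    return new_tokens, new_tags


dataset = [
    (["John", "lives", "in", "Paris"], ["B-PER", "O", "O", "B-LOC"]),
    (["Mary", "Ann", "visited", "New", "York"],
     ["B-PER", "I-PER", "O", "B-LOC", "I-LOC"]),
]
pool = collect_entities(dataset)
cf_tokens, cf_tags = counterfactual(*dataset[0], pool, random.Random(0))
```

The generated pair can be appended to the original training set. Whether the paper also intervenes on the context, or filters the generated examples, is not stated in this record.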
Pages: 7270-7280
Number of pages: 11