CoTea: Collaborative teaching for low-resource named entity recognition with a divide-and-conquer strategy

Cited by: 0
Authors
Yang, Zhiwei [1, 2]
Ma, Jing [3 ]
Yang, Kang [4 ]
Lin, Huiru [5 ]
Chen, Hechang [4 ]
Yang, Ruichao [3 ]
Chang, Yi [4, 6]
Affiliations
[1] Jinan Univ, Guangdong Inst Smart Educ, Guangzhou, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[4] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
[5] Jinan Univ, Inst Phys Educ, Guangzhou, Peoples R China
[6] Jilin Univ, Int Ctr Future Sci, Changchun, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Low resource; Named entity recognition; Collaborative teaching; Divide-and-conquer
DOI
10.1016/j.ipm.2024.103657
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Low-resource named entity recognition (NER) aims to identify entity mentions when training data is scarce. Recent approaches resort to distant data with manual dictionaries for improvement, but such dictionaries are not always available for the target domain and have limited coverage of entities, which may introduce noise. In this paper, we propose a novel Collaborative Teaching (CoTea) framework for low-resource NER with a few supporting labeled examples, which can automatically augment training data and reduce label noise. Specifically, CoTea utilizes the entities in the supporting labeled examples to retrieve entity-related unlabeled data heuristically and then generates accurate distant labels with a novel mining-refining iterative mechanism. For optimizing distant labels, the mechanism mines potential entities from non-entity tokens with a recognition teacher and then refines entity labels with another prompt-based discrimination teacher in a divide-and-conquer manner. Experimental results on two benchmark datasets demonstrate that CoTea outperforms state-of-the-art baselines in low-resource settings and achieves 85% and 65% performance levels of the best high-resource baseline methods by merely utilizing about 2% of labeled data.
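The mining-refining loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`mine_and_refine`, `toy_recognize`, `toy_discriminate`), the per-token label format, the lexicon, and the stopping rule are all hypothetical and chosen only for illustration. The two "teachers" are plain Python stand-ins for what would be trained models in the paper.

```python
from typing import Callable, List

Tokens = List[str]
Labels = List[str]  # "O" for non-entity, otherwise an entity type such as "PER"


def mine_and_refine(
    tokens: Tokens,
    labels: Labels,
    recognize: Callable[[Tokens], Labels],
    discriminate: Callable[[Tokens, int, str], str],
    max_rounds: int = 3,
) -> Labels:
    """Hypothetical sketch: alternate a mining step and a refining step."""
    labels = list(labels)  # work on a copy
    for _ in range(max_rounds):
        changed = False
        # Mining: the recognition teacher may promote tokens currently labeled "O".
        proposed = recognize(tokens)
        for i, (current, new) in enumerate(zip(labels, proposed)):
            if current == "O" and new != "O":
                labels[i] = new
                changed = True
        # Refining: each entity label is checked on its own by the
        # discrimination teacher (our reading of "divide-and-conquer").
        for i, label in enumerate(labels):
            if label != "O":
                refined = discriminate(tokens, i, label)
                if refined != label:
                    labels[i] = refined
                    changed = True
        if not changed:
            break
    return labels


# Toy stand-ins for the two teachers (assumptions, not trained models).
TOY_LEXICON = {"Alice": "PER", "Paris": "LOC", "May": "PER"}


def toy_recognize(tokens: Tokens) -> Labels:
    # Label every token found in a small, deliberately noisy lexicon.
    return [TOY_LEXICON.get(tok, "O") for tok in tokens]


def toy_discriminate(tokens: Tokens, index: int, label: str) -> str:
    # Reject a PER label when the previous token is "in" (e.g. "in May").
    if label == "PER" and index > 0 and tokens[index - 1] == "in":
        return "O"
    return label


if __name__ == "__main__":
    sentence = ["Alice", "visited", "Paris", "in", "May"]
    seed = ["PER", "O", "O", "O", "O"]  # labels from a supporting example
    print(mine_and_refine(sentence, seed, toy_recognize, toy_discriminate))
```

In this toy run the mining step adds "Paris" and "May" as candidate entities, and the refining step then rejects the noisy "May" label, which mirrors the abstract's idea of expanding distant labels and then filtering noise.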
Pages: 17