CoTea: Collaborative teaching for low-resource named entity recognition with a divide-and-conquer strategy

Authors
Yang, Zhiwei [1,2]
Ma, Jing [3]
Yang, Kang [4]
Lin, Huiru [5]
Chen, Hechang [4]
Yang, Ruichao [3]
Chang, Yi [4,6]
Affiliations
[1] Jinan Univ, Guangdong Inst Smart Educ, Guangzhou, Peoples R China
[2] Jilin Univ, Coll Comp Sci & Technol, Changchun, Peoples R China
[3] Hong Kong Baptist Univ, Dept Comp Sci, Hong Kong, Peoples R China
[4] Jilin Univ, Sch Artificial Intelligence, Changchun, Peoples R China
[5] Jinan Univ, Inst Phys Educ, Guangzhou, Peoples R China
[6] Jilin Univ, Int Ctr Future Sci, Changchun, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Low resource; Named entity recognition; Collaborative teaching; Divide-and-conquer;
DOI
10.1016/j.ipm.2024.103657
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Low-resource named entity recognition (NER) aims to identify entity mentions when training data is scarce. Recent approaches resort to distant data with manual dictionaries for improvement, but such dictionaries are not always available for the target domain and have limited coverage of entities, which may introduce noise. In this paper, we propose a novel Collaborative Teaching (CoTea) framework for low-resource NER with a few supporting labeled examples, which can automatically augment training data and reduce label noise. Specifically, CoTea utilizes the entities in the supporting labeled examples to retrieve entity-related unlabeled data heuristically and then generates accurate distant labels with a novel mining-refining iterative mechanism. For optimizing distant labels, the mechanism mines potential entities from non-entity tokens with a recognition teacher and then refines entity labels with another prompt-based discrimination teacher in a divide-and-conquer manner. Experimental results on two benchmark datasets demonstrate that CoTea outperforms state-of-the-art baselines in low-resource settings and achieves 85% and 65% performance levels of the best high-resource baseline methods by merely utilizing about 2% of labeled data.
Pages: 17
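
The retrieve, mine, and refine steps named in the abstract can be pictured with a small Python sketch. This is an illustration only, not the authors' implementation: every name here (`retrieve_related`, `seed_labels`, `mine_refine`, the toy teachers, and `TOY_TYPES`) is hypothetical, the capitalization rule stands in for a trained recognition teacher, and the lookup table stands in for a prompt-based discrimination teacher.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)
Recognizer = Callable[[List[str]], Set[Span]]
Discriminator = Callable[[List[str], Span], Optional[str]]


def retrieve_related(support: Dict[str, str],
                     sentences: List[List[str]]) -> List[List[str]]:
    """Keep unlabeled sentences that mention at least one supporting entity."""
    return [s for s in sentences if any(tok in support for tok in s)]


def seed_labels(support: Dict[str, str], tokens: List[str]) -> Set[Span]:
    """Initial distant labels: exact single-token matches of supporting entities."""
    return {(i, i + 1, support[t]) for i, t in enumerate(tokens) if t in support}


def mine_refine(tokens: List[str], spans: Set[Span], recognizer: Recognizer,
                discriminator: Discriminator, max_rounds: int = 3) -> List[Span]:
    """Alternate mining and refining until the labels stop changing."""
    current = set(spans)
    for _ in range(max_rounds):
        covered = {i for start, end, _ in current for i in range(start, end)}
        # Mining: accept only candidates built from tokens labeled non-entity.
        mined = {sp for sp in recognizer(tokens)
                 if not covered.intersection(range(sp[0], sp[1]))}
        # Refining: judge each span on its own (the divide-and-conquer step).
        refined = set()
        for sp in current | mined:
            label = discriminator(tokens, sp)
            if label is not None:  # None means the span is rejected as noise
                refined.add((sp[0], sp[1], label))
        if refined == current:
            break
        current = refined
    return sorted(current)


def toy_recognizer(tokens: List[str]) -> Set[Span]:
    # Assumption: a stand-in that proposes every capitalized token.
    return {(i, i + 1, "ENT") for i, t in enumerate(tokens) if t[:1].isupper()}


TOY_TYPES = {"Alice": "PER", "Paris": "LOC"}  # hypothetical toy knowledge


def toy_discriminator(tokens: List[str], span: Span) -> Optional[str]:
    # Assumption: a stand-in that returns a type, or None to reject the span.
    return TOY_TYPES.get(" ".join(tokens[span[0]:span[1]]))


support = {"Alice": "PER"}  # entities from the few supporting labeled examples
corpus = [["The", "team", "met", "Alice", "in", "Paris"],
          ["It", "rained", "all", "day"]]
related = retrieve_related(support, corpus)
labels = mine_refine(related[0], seed_labels(support, related[0]),
                     toy_recognizer, toy_discriminator)
print(labels)  # [(3, 4, 'PER'), (5, 6, 'LOC')]
```

In this toy run only the first sentence is retrieved, "Paris" is mined from the non-entity tokens and typed by the discriminator, and the sentence-initial "The" is proposed but rejected, which is the kind of label noise the refining step is meant to filter.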