Named Entity Recognition Method Based on Multi-Teacher Collaborative Cyclical Knowledge Distillation

Cited by: 0
Authors
Jin, Chunqiao [1 ]
Yang, Shuangyuan [1 ]
Affiliations
[1] Xiamen Univ, Xiamen, Peoples R China
Keywords
Collaborative theory; knowledge distillation; named entity recognition
DOI
10.1109/CSCWD61410.2024.10580765
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Named Entity Recognition (NER) is a crucial task in Natural Language Processing (NLP), with applications ranging from information retrieval to biomedical research. Large pre-trained language models such as BERT have substantially improved NER performance, but they demand considerable computational resources. Knowledge distillation, in which a smaller "student" model learns from a larger "teacher" model, can compress models while retaining much of their effectiveness. This paper introduces Multi-Teacher Collaborative Cyclical Knowledge Distillation (MTCCKD), a novel approach inspired by collaborative learning. MTCCKD addresses the "curse of competence gap" by using multiple teachers of varying expertise: in each iteration, the student assesses its own performance and decides whether to switch teachers, so the pool of teachers collaboratively strengthens the student model. MTCCKD compresses knowledge effectively while maintaining or even improving NER performance, along with efficiency, adaptability, and robustness. Empirical validation on publicly available NER datasets shows that MTCCKD outperforms state-of-the-art models, achieving a 22-fold model compression while preserving 96% of the teacher model's performance. The method offers a promising solution for practical NER in resource-constrained environments.
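The abstract only outlines the cyclical teacher-switching idea, so the following is a minimal, illustrative sketch of how such a training loop might look in PyTorch. The switch criterion (no dev-F1 improvement for a fixed number of epochs), the soft/hard loss weighting alpha, the temperature T, and all names (distill_step, train_cyclical, evaluate) are assumptions made for illustration; they are not taken from the paper.

```python
# Illustrative sketch of cyclical multi-teacher distillation for token
# classification (NER). Not the authors' implementation; hyperparameters
# and the teacher-switch rule are assumptions.
import torch
import torch.nn.functional as F

def distill_step(student, teacher, batch, optimizer, alpha=0.5, T=2.0):
    """One step: hard-label cross-entropy plus KL to the current teacher's soft labels."""
    tokens, labels = batch                        # token ids, gold entity tags
    with torch.no_grad():
        teacher_logits = teacher(tokens)          # teacher stays frozen
    student_logits = student(tokens)
    hard_loss = F.cross_entropy(
        student_logits.view(-1, student_logits.size(-1)), labels.view(-1))
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean") * (T * T)
    loss = alpha * hard_loss + (1.0 - alpha) * soft_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def train_cyclical(student, teachers, loader, optimizer, evaluate,
                   epochs=30, patience=2):
    """Cycle to the next teacher when the student's dev F1 stops improving."""
    t_idx, best_f1, stale = 0, 0.0, 0
    for _ in range(epochs):
        for batch in loader:
            distill_step(student, teachers[t_idx], batch, optimizer)
        f1 = evaluate(student)                    # student's dev-set F1
        if f1 > best_f1:
            best_f1, stale = f1, 0
        else:
            stale += 1
        if stale >= patience:                     # assumed switch criterion
            t_idx = (t_idx + 1) % len(teachers)   # move to the next teacher
            stale = 0
    return student
```

The design point the sketch tries to capture is that the student, not a fixed schedule, decides when to change teachers, which is one plausible way to mitigate the competence gap between a small student and a much larger teacher.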
Pages: 230-235 (6 pages)
Related Papers
50 records in total
[1] Li, Zhuoran; Hu, Chunming; Zhang, Richong; Chen, Junfan; Guo, Xiaohui. Zero-Shot Cross-Lingual Named Entity Recognition via Progressive Multi-Teacher Distillation. IEEE/ACM Transactions on Audio, Speech and Language Processing, 2024, 32: 4617-4630.
[2] Li, Ruoyu; Yun, Lijun; Zhang, Mingxuan; Yang, Yanchen; Cheng, Feiyan. Cross-View Gait Recognition Method Based on Multi-Teacher Joint Knowledge Distillation. Sensors, 2023, 23(22).
[3] Cheng, Xin; Tang, Jialiang; Zhang, Zhiqiang; Yu, Wenxin; Jiang, Ning; Zhou, Jinjia. Decoupled Multi-teacher Knowledge Distillation Based on Entropy. 2024 IEEE International Symposium on Circuits and Systems (ISCAS 2024), 2024.
[4] Ma, Ye; Jiang, Xu; Guan, Nan; Yi, Wang. Anomaly Detection Based on Multi-teacher Knowledge Distillation. Journal of Systems Architecture, 2023, 138.
[5] Zhou, Xuan; Zhang, Xiao; Tao, Chenyang; Chen, Junya; Xu, Bing; Wang, Wei; Xiao, Jing. Multi-Grained Knowledge Distillation for Named Entity Recognition. 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2021), 2021: 5704-5716.
[6] Wu, Meng-Chieh; Chiu, Ching-Te. Multi-Teacher Knowledge Distillation for Compressed Video Action Recognition Based on Deep Learning. Journal of Systems Architecture, 2020, 103.
[7] Shi, Luyao; Jiang, Ning; Tang, Jialiang; Huang, Xinlei. Correlation Guided Multi-Teacher Knowledge Distillation. Neural Information Processing (ICONIP 2023), Part IV, 2024, 14450: 562-574.
[8] Yuan, Fei; Shou, Linjun; Pei, Jian; Lin, Wutao; Gong, Ming; Fu, Yan; Jiang, Daxin. Reinforced Multi-Teacher Selection for Knowledge Distillation. Thirty-Fifth AAAI Conference on Artificial Intelligence, Thirty-Third Conference on Innovative Applications of Artificial Intelligence, and Eleventh Symposium on Educational Advances in Artificial Intelligence (AAAI 2021), 2021, 35: 14284-14291.
[9] Hu, B.; Geng, T.; Deng, G.; Duan, L. Faster Biomedical Named Entity Recognition Based on Knowledge Distillation. Qinghua Daxue Xuebao/Journal of Tsinghua University, 2021, 61(09): 936-942.
[10] Yang, Yang; Liu, Dan. mKDNAD: A Network Flow Anomaly Detection Method Based on Multi-Teacher Knowledge Distillation. 2022 16th IEEE International Conference on Signal Processing (ICSP 2022), Vol. 1, 2022: 314-319.