Improving knowledge distillation via pseudo-multi-teacher network

Cited by: 0
|
Authors
Li, Shunhang [1 ]
Shao, Mingwen [1 ]
Guo, Zihao [1 ]
Zhuang, Xinkai [1 ]
Affiliations
[1] China Univ Petr, Coll Comp Sci & Technol, Changjiang Rd, Qingdao 266580, Shandong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Convolutional neural networks; Knowledge distillation; Online distillation; Mutual learning;
DOI
10.1007/s00138-023-01383-5
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing knowledge distillation methods usually push the student model to directly imitate the features or probabilities of the teacher model. However, the teacher's knowledge capacity limits the student's ability to learn undiscovered knowledge. To address this issue, we propose a pseudo-multi-teacher knowledge distillation method that augments the learning of undiscovered knowledge. Specifically, we design an auxiliary classifier to capture cross-layer semantic information, enabling our network to obtain richer supervision. In addition, we propose an ensemble module that combines the feature maps of each sub-network, generating a stronger ensemble of features to guide the network. Both the auxiliary classifier and the ensemble module are discarded after training, so no additional parameters are introduced into the final model. Comprehensive experiments on benchmark datasets demonstrate the effectiveness of our proposed method.
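The general objective this line of work builds on can be sketched as the classic temperature-softened distillation loss, plus a simple average of sub-network feature maps as an ensemble target. This is a minimal NumPy illustration of the standard Hinton-style KD loss and a naive feature ensemble; the function names, the averaging choice, and the temperature value are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; subtract max for numerical stability.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()
    e = np.exp(z)
    return e / e.sum()

def kd_loss(student_logits, teacher_logits, T=4.0):
    # KL(teacher || student) on temperature-softened distributions,
    # scaled by T^2 so gradients keep a comparable magnitude across T.
    p = softmax(teacher_logits, T)  # soft teacher targets
    q = softmax(student_logits, T)  # soft student predictions
    return (T ** 2) * float(np.sum(p * (np.log(p) - np.log(q))))

def ensemble_features(feature_maps):
    # Hypothetical ensemble target: element-wise mean of the
    # sub-networks' feature maps (all assumed to share one shape).
    return np.mean(np.stack(feature_maps, axis=0), axis=0)
```

The loss is zero when student and teacher logits agree and strictly positive otherwise, which is what makes it usable as a mimicry signal; the actual method also distills from the ensembled features, not only from probabilities.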
Pages: 11
Related papers
50 items
  • [1] Improving knowledge distillation via pseudo-multi-teacher network
    Shunhang Li
    Mingwen Shao
    Zihao Guo
    Xinkai Zhuang
    [J]. Machine Vision and Applications, 2023, 34
  • [2] Improving knowledge distillation via an expressive teacher
    Tan, Chao
    Liu, Jie
    Zhang, Xiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 218
  • [3] Knowledge Distillation via Multi-Teacher Feature Ensemble
    Ye, Xin
    Jiang, Rongxin
    Tian, Xiang
    Zhang, Rui
    Chen, Yaowu
    [J]. IEEE Signal Processing Letters, 2024, 31 : 566 - 570
  • [5] Improving Knowledge Distillation With a Customized Teacher
    Tan, Chao
    Liu, Jie
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (02) : 2290 - 2299
  • [6] Improved Knowledge Distillation via Teacher Assistant
    Mirzadeh, Seyed Iman
    Farajtabar, Mehrdad
    Li, Ang
    Levine, Nir
    Matsukawa, Akihiro
    Ghasemzadeh, Hassan
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 5191 - 5198
  • [7] Reinforced Multi-Teacher Selection for Knowledge Distillation
    Yuan, Fei
    Shou, Linjun
    Pei, Jian
    Lin, Wutao
    Gong, Ming
    Fu, Yan
    Jiang, Daxin
    [J]. THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 14284 - 14291
  • [8] Correlation Guided Multi-teacher Knowledge Distillation
    Shi, Luyao
    Jiang, Ning
    Tang, Jialiang
    Huang, Xinlei
    [J]. NEURAL INFORMATION PROCESSING, ICONIP 2023, PT IV, 2024, 14450 : 562 - 574
  • [9] Adaptive multi-teacher multi-level knowledge distillation
    Liu, Yuang
    Zhang, Wei
    Wang, Jun
    [J]. NEUROCOMPUTING, 2020, 415 : 106 - 113