Improving Knowledge Distillation With a Customized Teacher

Cited by: 5
Authors
Tan, Chao [1 ,2 ]
Liu, Jie [1 ,2 ]
Affiliations
[1] Natl Univ Def Technol, Sci & Technol Parallel & Distributed Proc Lab, Changsha 410073, Peoples R China
[2] Natl Univ Def Technol, Lab Software Engn Complex Syst, Changsha 410073, Peoples R China
Keywords
Knowledge distillation (KD); knowledge transfer; neural network acceleration; neural network compression
DOI
10.1109/TNNLS.2022.3189680
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation (KD) is a widely used approach for transferring knowledge from a cumbersome network (the teacher) to a lightweight network (the student). However, even when different teachers achieve similar accuracies, the accuracies of a fixed student distilled from them can differ significantly. We find that teachers whose secondary soft probabilities are more dispersed are better qualified to play their role. Therefore, an indicator, the standard deviation sigma of the secondary soft probabilities, is introduced to choose the teacher. Moreover, to make a teacher's secondary soft probabilities more dispersed, a novel method, dubbed pretraining the teacher under dual supervision (PTDS), is proposed. In addition, we put forward an asymmetrical transformation function (ATF) to further enhance the dispersion of the pretrained teacher's secondary soft probabilities. The combination of PTDS and ATF is termed knowledge distillation with a customized teacher (KDCT). Extensive experiments and analyses on three computer vision tasks, namely image classification, transfer learning, and semantic segmentation, substantiate the effectiveness of KDCT.
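As a reading aid only, the following is a minimal sketch (not the authors' code) of how the sigma indicator described in the abstract might be computed: the standard deviation of a teacher's secondary soft probabilities, i.e., the temperature-softened class probabilities with the top-1 (primary) class excluded. The function name, the temperature value, and averaging over a batch are illustrative assumptions.

```python
# Sketch of the sigma indicator from the abstract: std of "secondary" soft
# probabilities (softened probabilities excluding the top-1 class).
# Names, temperature, and batch averaging are assumptions, not the paper's code.
import torch
import torch.nn.functional as F

def secondary_softmax_std(logits: torch.Tensor, temperature: float = 4.0) -> torch.Tensor:
    """Per-sample std of secondary soft probabilities, averaged over the batch.

    logits: (batch, num_classes) teacher outputs on held-out data.
    temperature: softening temperature as used in standard KD.
    """
    probs = F.softmax(logits / temperature, dim=1)      # softened probabilities
    top1 = probs.argmax(dim=1, keepdim=True)            # primary (top-1) class per sample
    mask = torch.ones_like(probs, dtype=torch.bool)
    mask.scatter_(1, top1, False)                       # drop the primary class
    secondary = probs[mask].view(probs.size(0), -1)     # (batch, num_classes - 1)
    return secondary.std(dim=1).mean()                  # sigma, batch-averaged

# Example: compare two candidate teachers on the same dummy batch; the teacher
# with the larger sigma has more dispersed secondary probabilities.
if __name__ == "__main__":
    logits_a = torch.randn(8, 100)  # dummy logits, e.g., a CIFAR-100 teacher
    logits_b = torch.randn(8, 100)
    print(secondary_softmax_std(logits_a), secondary_softmax_std(logits_b))
```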
Pages: 2290-2299
Page count: 10
Related Papers
50 records
  • [1] Improving knowledge distillation via an expressive teacher
    Tan, Chao
    Liu, Jie
    Zhang, Xiang
    [J]. KNOWLEDGE-BASED SYSTEMS, 2021, 218
  • [2] Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher
    Zhu, Yichen
    Wang, Yi
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 5037 - 5046
  • [3] Recruiting the Best Teacher Modality: A Customized Knowledge Distillation Method for IF Based Nephropathy Diagnosis
    Dai, Ning
    Jiang, Lai
    Fu, Yibing
    Pan, Sai
    Xu, Mai
    Deng, Xin
    Chen, Pu
    Chen, Xiangmei
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION, MICCAI 2023, PT V, 2023, 14224 : 526 - 536
  • [4] Improving knowledge distillation via pseudo-multi-teacher network
    Li, Shunhang
    Shao, Mingwen
    Guo, Zihao
    Zhuang, Xinkai
    [J]. MACHINE VISION AND APPLICATIONS, 2023, 34 (02)
  • [5] Improving knowledge distillation via pseudo-multi-teacher network
    Shunhang Li
    Mingwen Shao
    Zihao Guo
    Xinkai Zhuang
    [J]. Machine Vision and Applications, 2023, 34
  • [6] PURF: Improving teacher representations by imposing smoothness constraints for knowledge distillation
    Hossain, Md Imtiaz
    Akhter, Sharmen
    Hong, Choong Seon
    Huh, Eui-Nam
    [J]. APPLIED SOFT COMPUTING, 2024, 159
  • [7] Adapt Your Teacher: Improving Knowledge Distillation for Exemplar-free Continual Learning
    Szatkowski, Filip
    Pyla, Mateusz
    Przewiezlikowski, Marcin
    Cygert, Sebastian
    Twardowski, Bartlomiej
    Trzcinski, Tomasz
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 3504 - 3509
  • [8] Knowledge Distillation with the Reused Teacher Classifier
    Chen, Defang
    Mei, Jian-Ping
    Zhang, Hailin
    Wang, Can
    Feng, Yan
    Chen, Chun
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11923 - 11932
  • [9] Knowledge Distillation from A Stronger Teacher
    Huang, Tao
    You, Shan
    Wang, Fei
    Qian, Chen
    Xu, Chang
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35, NEURIPS 2022, 2022
  • [10] Improving Low-Resource Neural Machine Translation With Teacher-Free Knowledge Distillation
    Zhang, Xinlu
    Li, Xiao
    Yang, Yating
    Dong, Rui
    [J]. IEEE ACCESS, 2020, 8 : 206638 - 206645