Decoupled Multi-teacher Knowledge Distillation based on Entropy

Cited: 0
Authors
Cheng, Xin [1 ]
Tang, Jialiang [2 ]
Zhang, Zhiqiang [3 ]
Yu, Wenxin [3 ]
Jiang, Ning [3 ]
Zhou, Jinjia [1 ]
Affiliations
[1] Hosei Univ, Grad Sch Sci & Engn, Tokyo, Japan
[2] Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing, Peoples R China
[3] Southwest Univ Sci & Technol, Sch Comp Sci & Technol, Mianyang, Sichuan, Peoples R China
Keywords
Multi-teacher knowledge distillation; image classification; entropy; deep learning
DOI
10.1109/ISCAS58744.2024.10558141
Chinese Library Classification
TP39 [Computer Applications]
Discipline Codes
081203; 0835
Abstract
Multi-teacher knowledge distillation (MKD) aims to leverage the valuable and diverse knowledge provided by multiple teacher networks to improve the performance of the student network. Existing approaches typically rely on simple methods, such as averaging the prediction logits or using sub-optimal weighting strategies, to combine knowledge from multiple teachers. However, these techniques cannot fully reflect the importance of each teacher and may even mislead the student's learning. To address these issues, we propose a novel Decoupled Multi-teacher Knowledge Distillation based on Entropy (DE-MKD). DE-MKD decomposes the vanilla KD loss and assigns a weight to each teacher that reflects its importance, based on the entropy of its predictions. Furthermore, we extend the proposed approach to distill the intermediate features from teachers to further improve the performance of the student network. Extensive experiments conducted on the publicly available CIFAR-100 image classification dataset demonstrate the effectiveness and flexibility of our proposed approach.
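
Note (illustrative): this record does not include the paper's formulas, so the following is a minimal PyTorch-style sketch of the entropy-based teacher weighting described in the abstract. The function name entropy_weighted_mkd_loss, the temperature T, and the assumption that lower-entropy (more confident) teachers receive larger weights are illustrative choices, not the authors' implementation.

```python
# Minimal sketch: combine KD losses from several teachers, weighting each teacher
# by the entropy of its softened predictions (assumption: lower entropy -> larger weight).
import torch
import torch.nn.functional as F

def entropy_weighted_mkd_loss(student_logits, teacher_logits_list, T=4.0):
    # Per-teacher entropy of the softened predictions, averaged over the batch.
    entropies = []
    for t_logits in teacher_logits_list:
        p = F.softmax(t_logits / T, dim=1)
        h = -(p * p.clamp_min(1e-12).log()).sum(dim=1).mean()
        entropies.append(h)
    entropies = torch.stack(entropies)

    # Confidence weights: softmax over negative entropies, so more confident
    # (lower-entropy) teachers get larger normalized weights (an assumption).
    weights = F.softmax(-entropies, dim=0)

    # Standard temperature-scaled KL-divergence KD loss, one term per teacher.
    log_p_student = F.log_softmax(student_logits / T, dim=1)
    loss = 0.0
    for w, t_logits in zip(weights, teacher_logits_list):
        p_teacher = F.softmax(t_logits / T, dim=1)
        loss = loss + w * F.kl_div(log_p_student, p_teacher,
                                   reduction="batchmean") * (T ** 2)
    return loss
```

The softmax over negative entropies is one simple way to turn per-teacher confidence into normalized weights; the paper's actual decomposition and weighting rule may differ.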
Pages: 5