DM-KD: Decoupling Mixed-Images for Efficient Knowledge Distillation

Cited by: 0
Authors
Im, Jongkyung [1 ]
Jang, Younho [2 ]
Lim, Junpyo [1 ]
Kang, Taegoo [1 ]
Zhang, Chaoning [2 ]
Bae, Sung-Ho [2 ]
Affiliations
[1] Kyung Hee Univ, Dept Artificial Intelligence, Yongin 17104, South Korea
[2] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin 17104, South Korea
Source
IEEE ACCESS | 2025, Vol. 13
Keywords
Predictive models; Computational modeling; Entropy; Data augmentation; Quantization (signal); Probability distribution; Uncertainty; Feeds; Degradation; Data models; Deep learning; knowledge distillation; augmentation; decoupling; CutMix; MixUp;
DOI
10.1109/ACCESS.2024.3524734
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Knowledge distillation (KD) is a model compression method that extracts valuable knowledge from a high-performance, high-capacity teacher model and transfers it to a target student model of relatively small capacity. However, we discover that naively applying mixed-image augmentation to KD degrades the student model's learning: mixed images tend to make the teacher generate unstable, poor-quality logits that hinder knowledge transfer. We analyze this side effect of mixed augmentation in KD and propose a new method that addresses it. Specifically, we decouple an input mixed image into its two original images and feed each into the teacher model individually; we then interpolate the two resulting logits to generate a single logit for KD, while the student still receives the mixed image as input. This decoupling strategy stabilizes the teacher's logit distributions, resulting in higher KD performance under mixed augmentation. To verify the effectiveness of the proposed method, we experiment on various datasets and mixed-augmentation methods, demonstrating a 0.31%-0.69% improvement in top-1 accuracy over the original KD method on the ImageNet dataset.
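The following is a minimal sketch of the decoupling strategy described in the abstract, assuming PyTorch and MixUp-style mixing; the names `student`, `teacher`, `dm_kd_loss`, the Beta parameter `alpha`, and the temperature `T` are illustrative assumptions, not the authors' released implementation.

# Sketch: student sees the mixed image; teacher sees the two original images
# separately, and its logits are interpolated with the same mixing coefficient.
import torch
import torch.nn.functional as F

def dm_kd_loss(student, teacher, x, alpha=1.0, T=4.0):
    # Sample a MixUp coefficient and a random pairing of the batch.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)

    # Student input: the mixed image (standard MixUp).
    x_mix = lam * x + (1.0 - lam) * x[perm]
    s_logits = student(x_mix)

    # Teacher input: the two unmixed images fed individually; the two logit
    # sets are then interpolated to form the KD target.
    with torch.no_grad():
        t_logits = lam * teacher(x) + (1.0 - lam) * teacher(x[perm])

    # Temperature-scaled KL divergence between teacher and student distributions.
    return F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

In a training loop this KD term would typically be combined with the usual mixed-label cross-entropy on the student's logits; the weighting between the two terms is not specified in the abstract.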
Pages: 10527-10534
Page count: 8
Related Papers
10 items
  • [1] Pea-KD: Parameter-efficient and accurate Knowledge Distillation on BERT
    Cho, Ikhyun
    Kang, U.
    PLOS ONE, 2022, 17 (02):
  • [2] Efficient Scene Text Detection in Images with Network Pruning and Knowledge Distillation
    Orenbas, Halit
    Oymagil, Anil
    Baydar, Melih
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] KD-SegNet: Efficient Semantic Segmentation Network with Knowledge Distillation Based on Monocular Camera
    Dang, Thai-Viet
    Bui, Nhu-Nghia
    Tan, Phan Xuan
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (02): : 2001 - 2026
  • [4] MSTNet-KD: Multilevel Transfer Networks Using Knowledge Distillation for the Dense Prediction of Remote-Sensing Images
    Zhou, Wujie
    Li, Yangzhen
    Huan, Juan
    Liu, Yuanyuan
    Jiang, Qiuping
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 12
  • [5] EPANet-KD: Efficient progressive attention network for fine-grained provincial village classification via knowledge distillation
    Zhang, Cheng
    Liu, Chunqing
    Gong, Huimin
    Teng, Jinlin
    PLOS ONE, 2024, 19 (02):
  • [6] SSD-KD: A self-supervised diverse knowledge distillation method for lightweight skin lesion classification using dermoscopic images
    Wang, Yongwei
    Wang, Yuheng
    Cai, Jiayue
    Lee, Tim K.
    Miao, Chunyan
    Wang, Z. Jane
    MEDICAL IMAGE ANALYSIS, 2023, 84
  • [7] KD-PAR: A knowledge distillation-based pedestrian attribute recognition model with multi-label mixed feature learning network
    Wu, Peishu
    Wang, Zidong
    Li, Han
    Zeng, Nianyin
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [8] Efficient Multi-Organ Segmentation From 3D Abdominal CT Images With Lightweight Network and Knowledge Distillation
    Zhao, Qianfei
    Zhong, Lanfeng
    Xiao, Jianghong
    Zhang, Jingbo
    Chen, Yinan
    Liao, Wenjun
    Zhang, Shaoting
    Wang, Guotai
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (09) : 2513 - 2523
  • [9] Efficient Fine-Grained Object Recognition in High-Resolution Remote Sensing Images From Knowledge Distillation to Filter Grafting
    Wang, Liuqian
    Zhang, Jing
    Tian, Jimiao
    Li, Jiafeng
    Zhuo, Li
    Tian, Qi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [10] Efficient knowledge distillation for hybrid models: A vision transformer-convolutional neural network to convolutional neural network approach for classifying remote sensing images
    Song, Huaxiang
    Yuan, Yuxuan
    Ouyang, Zhiwei
    Yang, Yu
    Xiang, Hui
    IET CYBER-SYSTEMS AND ROBOTICS, 2024, 6 (03)