DM-KD: Decoupling Mixed-Images for Efficient Knowledge Distillation

Cited by: 0
Authors
Im, Jongkyung [1 ]
Jang, Younho [2 ]
Lim, Junpyo [1 ]
Kang, Taegoo [1 ]
Zhang, Chaoning [2 ]
Bae, Sung-Ho [2 ]
Affiliations
[1] Kyung Hee Univ, Dept Artificial Intelligence, Yongin 17104, South Korea
[2] Kyung Hee Univ, Dept Comp Sci & Engn, Yongin 17104, South Korea
Source
IEEE ACCESS | 2025, Vol. 13
Keywords
Predictive models; Computational modeling; Entropy; Data augmentation; Quantization (signal); Probability distribution; Uncertainty; Feeds; Degradation; Data models; Deep learning; knowledge distillation; augmentation; decoupling; CutMix; MixUp;
DOI
10.1109/ACCESS.2024.3524734
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Knowledge distillation (KD) is a model compression method that extracts valuable knowledge from a high-performance, high-capacity teacher model and transfers it to a target student model of relatively small capacity. However, we discover that naively applying mixed-image augmentation to KD degrades the student model's learning: mixed images tend to make the teacher generate unstable, poor-quality logits that hinder knowledge transfer. We analyze this side effect of mixed augmentation in KD and propose a new method that addresses it. Specifically, we decouple an input mixed image into its two original images and feed each into the teacher model individually; we then interpolate the two resulting logits to generate a single logit for KD, while the student still receives the mixed image as input. This decoupling strategy stabilizes the teacher's logit distributions, resulting in higher KD performance under mixed augmentation. To verify the effectiveness of the proposed method, we experiment on various datasets and mixed-augmentation methods, demonstrating a 0.31%-0.69% improvement in top-1 accuracy over the original KD method on the ImageNet dataset.
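The following is a minimal sketch of the decoupling strategy described in the abstract, assuming PyTorch and MixUp-style mixing; the names `student`, `teacher`, `dm_kd_loss`, the Beta parameter `alpha`, and the temperature `T` are illustrative assumptions, not the authors' released implementation.

# Sketch: student sees the mixed image; teacher sees the two original images
# separately, and its logits are interpolated with the same mixing coefficient.
import torch
import torch.nn.functional as F

def dm_kd_loss(student, teacher, x, alpha=1.0, T=4.0):
    # Sample a MixUp coefficient and a random pairing of the batch.
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0), device=x.device)

    # Student input: the mixed image (standard MixUp).
    x_mix = lam * x + (1.0 - lam) * x[perm]
    s_logits = student(x_mix)

    # Teacher input: the two unmixed images fed individually; the two logit
    # sets are then interpolated to form the KD target.
    with torch.no_grad():
        t_logits = lam * teacher(x) + (1.0 - lam) * teacher(x[perm])

    # Temperature-scaled KL divergence between teacher and student distributions.
    return F.kl_div(
        F.log_softmax(s_logits / T, dim=1),
        F.softmax(t_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)

In a training loop this KD term would typically be combined with the usual mixed-label cross-entropy on the student's logits; the weighting between the two terms is not specified in the abstract.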
Pages: 10527-10534
Page count: 8
Related Papers
10 items
  • [1] Pea-KD: Parameter-efficient and accurate Knowledge Distillation on BERT
    Cho, Ikhyun
    Kang, U.
    PLOS ONE, 2022, 17 (02):
  • [2] Efficient Scene Text Detection in Images with Network Pruning and Knowledge Distillation
    Orenbas, Halit
    Oymagil, Anil
    Baydar, Melih
    29TH IEEE CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS (SIU 2021), 2021,
  • [3] KD-SegNet: Efficient Semantic Segmentation Network with Knowledge Distillation Based on Monocular Camera
    Dang, Thai-Viet
    Bui, Nhu-Nghia
    Tan, Phan Xuan
    CMC-COMPUTERS MATERIALS & CONTINUA, 2025, 82 (02): : 2001 - 2026
  • [4] MSTNet-KD: Multilevel Transfer Networks Using Knowledge Distillation for the Dense Prediction of Remote-Sensing Images
    Zhou, Wujie
    Li, Yangzhen
    Huan, Juan
    Liu, Yuanyuan
    Jiang, Qiuping
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, 62 : 1 - 12
  • [5] EPANet-KD: Efficient progressive attention network for fine-grained provincial village classification via knowledge distillation
    Zhang, Cheng
    Liu, Chunqing
    Gong, Huimin
    Teng, Jinlin
    PLOS ONE, 2024, 19 (02):
  • [6] SSD-KD: A self-supervised diverse knowledge distillation method for lightweight skin lesion classification using dermoscopic images
    Wang, Yongwei
    Wang, Yuheng
    Cai, Jiayue
    Lee, Tim K.
    Miao, Chunyan
    Wang, Z. Jane
    MEDICAL IMAGE ANALYSIS, 2023, 84
  • [7] KD-PAR: A knowledge distillation-based pedestrian attribute recognition model with multi-label mixed feature learning network
    Wu, Peishu
    Wang, Zidong
    Li, Han
    Zeng, Nianyin
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 237
  • [8] Efficient Multi-Organ Segmentation From 3D Abdominal CT Images With Lightweight Network and Knowledge Distillation
    Zhao, Qianfei
    Zhong, Lanfeng
    Xiao, Jianghong
    Zhang, Jingbo
    Chen, Yinan
    Liao, Wenjun
    Zhang, Shaoting
    Wang, Guotai
    IEEE TRANSACTIONS ON MEDICAL IMAGING, 2023, 42 (09) : 2513 - 2523
  • [9] Efficient Fine-Grained Object Recognition in High-Resolution Remote Sensing Images From Knowledge Distillation to Filter Grafting
    Wang, Liuqian
    Zhang, Jing
    Tian, Jimiao
    Li, Jiafeng
    Zhuo, Li
    Tian, Qi
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [10] Efficient knowledge distillation for hybrid models: A vision transformer-convolutional neural network to convolutional neural network approach for classifying remote sensing images
    Song, Huaxiang
    Yuan, Yuxuan
    Ouyang, Zhiwei
    Yang, Yu
    Xiang, Hui
    IET CYBER-SYSTEMS AND ROBOTICS, 2024, 6 (03)