DIFFERENTIABLE DYNAMIC CHANNEL ASSOCIATION FOR KNOWLEDGE DISTILLATION

Cited by: 0
Authors
Tang, Qiankun [1 ]
Xu, Xiaogang [1 ,2 ]
Wang, Jun [1 ]
Affiliations
[1] Zhejiang Lab, Inst Artificial Intelligence, Hangzhou, Peoples R China
[2] Zhejiang Gongshang Univ, Sch Comp & Informat Engn, Hangzhou, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021
Keywords
knowledge distillation; dynamic channel association; weighted distillation
DOI
10.1109/ICIP42928.2021.9506271
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation is an effective model compression technique that encourages a small student model to mimic the features or probabilistic outputs of a large teacher model. Existing feature-based distillation methods mainly focus on formulating enriched representations, while naively bridging the channel-dimension gap and adopting handcrafted channel-association strategies between teacher and student for distillation. This not only introduces extra parameters and computational cost, but may also transfer irrelevant information to the student. In this paper, we present a differentiable and efficient Dynamic Channel Association (DCA) mechanism that automatically associates appropriate teacher channels with each student channel. DCA also enables each student channel to distill knowledge from multiple teacher channels in a weighted manner. Extensive experiments on the classification task, with various combinations of teacher and student network architectures, demonstrate the effectiveness of the proposed approach.
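The abstract describes DCA as learning, for every student channel, a differentiable weighted association over teacher channels. Below is a minimal sketch of that idea, assuming the association is a learnable student-by-teacher matrix normalized with a softmax and trained through a feature-mimicking loss; the class and variable names are hypothetical and the paper's exact formulation may differ.

# Illustrative sketch only, not the authors' released code. It assumes the channel
# association is a learnable (C_s x C_t) matrix, softmax-normalized over teacher
# channels, so each student channel mimics a weighted mixture of teacher channels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicChannelAssociationSketch(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # One learnable logit per (student channel, teacher channel) pair.
        self.assoc_logits = nn.Parameter(torch.zeros(student_channels, teacher_channels))

    def forward(self, f_s: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        # f_s: (N, C_s, H, W) student features; f_t: (N, C_t, H, W) teacher features,
        # assumed here to share the same spatial size.
        weights = F.softmax(self.assoc_logits, dim=1)           # (C_s, C_t), rows sum to 1
        # Weighted combination of teacher channels for every student channel.
        targets = torch.einsum('st,nthw->nshw', weights, f_t)   # (N, C_s, H, W)
        # Feature-mimicking loss; gradients reach the association weights end to end.
        return F.mse_loss(f_s, targets)

# Toy usage: distill a 512-channel teacher feature map into a 128-channel student one.
dca = DynamicChannelAssociationSketch(student_channels=128, teacher_channels=512)
loss = dca(torch.randn(8, 128, 7, 7), torch.randn(8, 512, 7, 7))
loss.backward()  # gradients flow into the learnable association logits

In this toy call the student tensor is random, so only the association logits receive gradients; in a full distillation setup the same loss would be added to the student's task loss and back-propagated through the student network as well.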
Pages: 414-418
Number of pages: 5