DIFFERENTIABLE DYNAMIC CHANNEL ASSOCIATION FOR KNOWLEDGE DISTILLATION

Cited by: 0
Authors
Tang, Qiankun [1 ]
Xu, Xiaogang [1 ,2 ]
Wang, Jun [1 ]
Affiliations
[1] Zhejiang Lab, Inst Artificial Intelligence, Hangzhou, Peoples R China
[2] Zhejiang Gongshang Univ, Sch Comp & Informat Engn, Hangzhou, Peoples R China
Source
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021
Keywords
knowledge distillation; dynamic channel association; weighted distillation
DOI
10.1109/ICIP42928.2021.9506271
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation is an effective model compression technique that encourages a small student model to mimic the features or probabilistic outputs of a large teacher model. Existing feature-based distillation methods mainly focus on formulating enriched representations, while naively bridging the channel-dimension gap and adopting handcrafted channel-association strategies between teacher and student for distillation. This not only introduces extra parameters and computational cost, but may also transfer irrelevant information to the student. In this paper, we present a differentiable and efficient Dynamic Channel Association (DCA) mechanism that automatically associates appropriate teacher channels with each student channel. DCA also enables each student channel to distill knowledge from multiple teacher channels in a weighted manner. Extensive experiments on the classification task, with various combinations of teacher and student network architectures, demonstrate the effectiveness of the proposed approach.
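The abstract describes DCA as learning, for every student channel, a differentiable weighted association over teacher channels. Below is a minimal sketch of that idea, assuming the association is a learnable student-by-teacher matrix normalized with a softmax and trained through a feature-mimicking loss; the class and variable names are hypothetical and the paper's exact formulation may differ.

# Illustrative sketch only, not the authors' released code. It assumes the channel
# association is a learnable (C_s x C_t) matrix, softmax-normalized over teacher
# channels, so each student channel mimics a weighted mixture of teacher channels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicChannelAssociationSketch(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # One learnable logit per (student channel, teacher channel) pair.
        self.assoc_logits = nn.Parameter(torch.zeros(student_channels, teacher_channels))

    def forward(self, f_s: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        # f_s: (N, C_s, H, W) student features; f_t: (N, C_t, H, W) teacher features,
        # assumed here to share the same spatial size.
        weights = F.softmax(self.assoc_logits, dim=1)           # (C_s, C_t), rows sum to 1
        # Weighted combination of teacher channels for every student channel.
        targets = torch.einsum('st,nthw->nshw', weights, f_t)   # (N, C_s, H, W)
        # Feature-mimicking loss; gradients reach the association weights end to end.
        return F.mse_loss(f_s, targets)

# Toy usage: distill a 512-channel teacher feature map into a 128-channel student one.
dca = DynamicChannelAssociationSketch(student_channels=128, teacher_channels=512)
loss = dca(torch.randn(8, 128, 7, 7), torch.randn(8, 512, 7, 7))
loss.backward()  # gradients flow into the learnable association logits

In this toy call the student tensor is random, so only the association logits receive gradients; in a full distillation setup the same loss would be added to the student's task loss and back-propagated through the student network as well.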
Pages: 414-418
Number of pages: 5