DIFFERENTIABLE DYNAMIC CHANNEL ASSOCIATION FOR KNOWLEDGE DISTILLATION

Cited by: 0
Authors
Tang, Qiankun [1]
Xu, Xiaogang [1,2]
Wang, Jun [1]
Affiliations
[1] Zhejiang Lab, Inst Artificial Intelligence, Hangzhou, Peoples R China
[2] Zhejiang Gongshang Univ, Sch Comp & Informat Engn, Hangzhou, Peoples R China
Keywords
knowledge distillation; dynamic channel association; weighted distillation
DOI
10.1109/ICIP42928.2021.9506271
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Knowledge distillation is an effective model compression technique that encourages a small student model to mimic the features or probabilistic outputs of a large teacher model. Existing feature-based distillation methods mainly focus on formulating enriched representations, while naively addressing the channel-dimension gap and adopting handcrafted channel association strategies between teacher and student for distillation. This not only introduces extra parameters and computational cost, but may also transfer irrelevant information to the student. In this paper, we present a differentiable and efficient Dynamic Channel Association (DCA) mechanism, which automatically associates the proper teacher channels with each student channel. DCA also enables each student channel to distill knowledge from multiple teacher channels in a weighted manner. Extensive experiments on the classification task, with various combinations of teacher and student network architectures, demonstrate the effectiveness of the proposed approach.
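The sketch below is a minimal, hypothetical illustration of the idea described in the abstract, not the authors' released implementation. Assuming a PyTorch setup, it models one plausible reading of dynamic channel association: a learnable logit matrix of shape (student channels, teacher channels), normalized with a softmax over the teacher dimension, yields differentiable per-student-channel weights; each student channel is then distilled toward the weighted mixture of teacher channels. The class name DynamicChannelAssociation, the temperature parameter, and the MSE distillation loss are all assumptions.

# Minimal sketch, assuming PyTorch; one plausible reading of the abstract,
# not the paper's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicChannelAssociation(nn.Module):
    """Learnable, differentiable association of teacher channels to student channels."""

    def __init__(self, student_channels: int, teacher_channels: int, temperature: float = 1.0):
        super().__init__()
        # One row of logits per student channel over all teacher channels.
        self.logits = nn.Parameter(torch.zeros(student_channels, teacher_channels))
        self.temperature = temperature

    def forward(self, student_feat: torch.Tensor, teacher_feat: torch.Tensor) -> torch.Tensor:
        # student_feat: (B, Cs, H, W); teacher_feat: (B, Ct, H, W)
        teacher_feat = teacher_feat.detach()  # teacher is frozen (common KD practice)
        # Softmax over teacher channels gives, for each student channel,
        # a soft (weighted) selection of teacher channels.
        assoc = F.softmax(self.logits / self.temperature, dim=1)            # (Cs, Ct)
        # Weighted mixture of teacher channels for every student channel.
        mixed_teacher = torch.einsum('st,bthw->bshw', assoc, teacher_feat)  # (B, Cs, H, W)
        # Distillation loss: student mimics its associated teacher mixture.
        return F.mse_loss(student_feat, mixed_teacher)

# Hypothetical usage with random tensors standing in for real feature maps.
if __name__ == '__main__':
    dca = DynamicChannelAssociation(student_channels=32, teacher_channels=64)
    s = torch.randn(2, 32, 8, 8)
    t = torch.randn(2, 64, 8, 8)
    loss = dca(s, t)
    loss.backward()  # gradients also flow into the association logits
    print(loss.item())

Because the association weights are produced by a softmax over learnable logits, the channel assignment remains fully differentiable and can be trained jointly with the usual distillation objective, which is what distinguishes this kind of dynamic association from a handcrafted, fixed channel mapping.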
Pages: 414-418
Number of pages: 5
Related papers (50 records in total)
  • [1] DDK: Dynamic structure pruning based on differentiable search and recursive knowledge distillation for BERT
    Zhang, Zhou
    Lu, Yang
    Wang, Tengfei
    Wei, Xing
    Wei, Zhen
    NEURAL NETWORKS, 2024, 173
  • [2] Channel Affinity Knowledge Distillation for Semantic Segmentation
    Li, Huakun
    Zhang, Yuhang
    Tian, Shishun
    Cheng, Pengfei
    You, Rong
    Zou, Wenbin
    2023 IEEE 25TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING, MMSP, 2023
  • [3] Knowledge Distillation via Channel Correlation Structure
    Li, Bo
    Chen, Bin
    Wang, Yunxiao
    Dai, Tao
    Hu, Maowei
    Jiang, Yong
    Xia, Shutao
    KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, PT I, 2021, 12815 : 357 - 368
  • [4] Dynamic Knowledge Distillation with Cross-Modality Knowledge Transfer
    Wang, Guangzhi
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 2974 - 2978
  • [5] Channel-wise Knowledge Distillation for Dense Prediction
    Shu, Changyong
    Liu, Yifan
    Gao, Jianfei
    Yan, Zheng
    Shen, Chunhua
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 5291 - 5300
  • [6] Channel-Correlation-Based Selective Knowledge Distillation
    Gou, Jianping
    Xiong, Xiangshuo
    Yu, Baosheng
    Zhan, Yibing
    Yi, Zhang
    IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2023, 15 (03) : 1574 - 1585
  • [7] DD-YOLO: An object detection method combining knowledge distillation and Differentiable Architecture Search
    Xing, Zhiqiang
    Chen, Xi
    Pang, Fengqian
    IET COMPUTER VISION, 2022, 16 (05) : 418 - 430
  • [8] Dynamic Refining Knowledge Distillation Based on Attention Mechanism
    Peng, Xuan
    Liu, Fang
    PRICAI 2022: TRENDS IN ARTIFICIAL INTELLIGENCE, PT II, 2022, 13630 : 45 - 58
  • [9] Dynamic Guidance Adversarial Distillation with Enhanced Teacher Knowledge
    Park, Hyejin
    Min, Dongbo
    COMPUTER VISION - ECCV 2024, PT LXXII, 2025, 15130 : 204 - 219
  • [10] A General Dynamic Knowledge Distillation Method for Visual Analytics
    Tu, Zhigang
    Liu, Xiangjian
    Xiao, Xuan
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 6517 - 6531