Inplace knowledge distillation with teacher assistant for improved training of flexible deep neural networks

Cited by: 3
Authors
Ozerov, Alexey [1 ]
Duong, Ngoc Q. K. [1 ]
Affiliation
[1] InterDigital R&D France, Cesson Sevigne, France
Funding
European Union Horizon 2020;
Keywords
Deep Neural Networks; Flexible Models; Inplace Knowledge Distillation with Teacher Assistant;
DOI
10.23919/EUSIPCO54536.2021.9616244
Chinese Library Classification (CLC)
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
Deep neural networks (DNNs) have achieved great success in various machine learning tasks. However, most existing powerful DNN models are computationally expensive and memory demanding, hindering their deployment on devices with low memory and computational resources or in applications with strict latency requirements. Thus, several resource-adaptable or flexible approaches were recently proposed that simultaneously train a big model and several resource-specific sub-models. Inplace knowledge distillation (IPKD) became a popular method to train those models; it consists in distilling the knowledge from the largest model (teacher) to all other sub-models (students). In this work, a novel generic training method called IPKD with teacher assistant (IPKD-TA) is introduced, where sub-models themselves become teacher assistants teaching smaller sub-models. We evaluated the proposed IPKD-TA training method using two state-of-the-art flexible models (MSDNet and Slimmable MobileNet-V1) on two popular image classification benchmarks (CIFAR-10 and CIFAR-100). Our results demonstrate that IPKD-TA is on par with the existing state of the art and improves upon it in most cases.
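To illustrate the idea described in the abstract, below is a minimal PyTorch sketch of an IPKD-TA-style training step, written under stated assumptions rather than taken from the paper's code: the flexible model is a toy slimmable MLP (not MSDNet or Slimmable MobileNet-V1), the names SlimmableMLP, ipkd_ta_step, WIDTHS and TEMPERATURE are illustrative, and the loss weighting and temperature are arbitrary. The key point it shows is that each smaller sub-model is distilled from the next-larger sub-model (its teacher assistant) instead of only from the full-width model as in plain IPKD.

```python
# Minimal IPKD-TA sketch (assumed toy setup, not the paper's implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

WIDTHS = [1.0, 0.75, 0.5, 0.25]   # full model first, then shrinking sub-models
TEMPERATURE = 3.0                  # softening temperature for distillation (illustrative)

class SlimmableMLP(nn.Module):
    """Toy flexible model: the first int(width * hidden) hidden units form a sub-model."""
    def __init__(self, in_dim=3072, hidden=512, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, num_classes)

    def forward(self, x, width=1.0):
        h = int(self.fc1.out_features * width)
        z = F.relu(F.linear(x, self.fc1.weight[:h], self.fc1.bias[:h]))
        return F.linear(z, self.fc2.weight[:, :h], self.fc2.bias)

def ipkd_ta_step(model, x, y, optimizer):
    """One training step: hard-label loss for the full-width model, KD loss for each
    smaller sub-model against the logits of the next-larger sub-model (teacher assistant)."""
    optimizer.zero_grad()
    teacher_logits = None
    total_loss = 0.0
    for width in WIDTHS:                       # widths sorted from largest to smallest
        logits = model(x, width=width)
        if teacher_logits is None:             # largest sub-model: supervised loss only
            total_loss = total_loss + F.cross_entropy(logits, y)
        else:                                  # smaller sub-models: distil from previous width
            kd = F.kl_div(
                F.log_softmax(logits / TEMPERATURE, dim=1),
                F.softmax(teacher_logits / TEMPERATURE, dim=1),
                reduction="batchmean",
            ) * TEMPERATURE ** 2
            total_loss = total_loss + kd
        teacher_logits = logits.detach()       # the current sub-model teaches the next one
    total_loss.backward()
    optimizer.step()
    return float(total_loss)

# Usage on random data (stand-in for CIFAR-10 batches):
model = SlimmableMLP()
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
x, y = torch.randn(32, 3072), torch.randint(0, 10, (32,))
print(ipkd_ta_step(model, x, y, opt))
```

Replacing `teacher_logits` with the full-width model's logits for every sub-model would recover plain IPKD; the single-line difference is what makes each sub-model act as a teacher assistant for the next one.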
Pages: 1356-1360
Number of pages: 5