Compact CNN Structure Learning by Knowledge Distillation

Cited by: 3
Authors
Ahmed, Waqar [1,2]
Zunino, Andrea [3]
Morerio, Pietro [1]
Murino, Vittorio [1,3,4]
Affiliations
[1] Ist Italiano Tecnol, Pattern Anal & Comp Vis PAVIS, Genoa, Italy
[2] Univ Genoa, Dipartimento Ingn Navale Elettr Elettron & Teleco, Genoa, Italy
[3] Huawei Technol Co Ltd, Ireland Res Ctr, Dublin, Ireland
[4] Univ Verona, Dipartimento Informat, Verona, Italy
Keywords
DOI
10.1109/ICPR48806.2021.9413006
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Compressing deep Convolutional Neural Networks (CNNs) is essential for operating within the limited computation, power, and memory budgets of embedded devices. However, existing methods achieve this objective at the cost of a drop in inference accuracy on computer vision tasks. To address this drawback, we propose a framework that leverages knowledge distillation along with customizable block-wise optimization to learn a lightweight CNN structure while preserving better control over the compression-performance tradeoff. Given specific resource constraints, e.g., floating-point operations per inference (FLOPs) or model parameters, our method achieves state-of-the-art network compression while attaining better inference accuracy. In a comprehensive evaluation, we demonstrate that our method is effective, robust, and consistent across a variety of network architectures and datasets, at negligible training overhead. In particular, for the already compact network MobileNet_v2, our method offers up to 2x and 5.2x better model compression in terms of FLOPs and model parameters, respectively, while achieving 1.05% better accuracy than the baseline network.
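The abstract outlines a framework that combines knowledge distillation with customizable block-wise optimization. As a point of reference only, below is a minimal sketch of a generic Hinton-style distillation loss in PyTorch; it illustrates the standard soft-target plus hard-label objective, not the authors' block-wise structure learning, and the temperature T and weight alpha are illustrative assumptions rather than values from the paper.

import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Generic knowledge-distillation objective (illustrative, not the paper's method):
    # a weighted sum of the soft-target KL term (teacher -> student) and the
    # hard-label cross-entropy term. T and alpha are assumed hyperparameters.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

In a typical training loop, teacher_logits would come from the frozen pre-trained network and student_logits from the compact student network being learned.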
Pages: 6554-6561
Number of pages: 8
Related Papers
50 records in total
  • [1] Knowledge Distillation based Compact Model Learning Method for Object Detection
    Ko, Jong Gook
    Yoo, Wonyoung
    11TH INTERNATIONAL CONFERENCE ON ICT CONVERGENCE: DATA, NETWORK, AND AI IN THE AGE OF UNTACT (ICTC 2020), 2020, : 1276 - 1278
  • [2] VARIATIONAL STUDENT: LEARNING COMPACT AND SPARSER NETWORKS IN KNOWLEDGE DISTILLATION FRAMEWORK
    Hegde, Srinidhi
    Prasad, Ranjitha
    Hebbalaguppe, Ramya
    Kumar, Vishwajeet
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3247 - 3251
  • [3] Adversarial Knowledge Distillation for a Compact Generator
    Tsunashima, Hideki
    Kataoka, Hirokatsu
    Yamato, Junji
    Chen, Qiu
    Morishima, Shigeo
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 10636 - 10643
  • [4] Combining Weight Pruning and Knowledge Distillation For CNN Compression
    Aghli, Nima
    Ribeiro, Eraldo
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 3185 - 3192
  • [5] Lightweight intrusion detection model based on CNN and knowledge distillation
    Wang, Long-Hui
    Dai, Qi
    Du, Tony
    Chen, Li-fang
    APPLIED SOFT COMPUTING, 2024, 165
  • [6] Joint architecture and knowledge distillation in CNN for Chinese text recognition
    Wang, Zi-Rui
    Du, Jun
    PATTERN RECOGNITION, 2021, 111
  • [7] Knowledge distillation vulnerability of DeiT through CNN adversarial attack
    Hong, Inpyo
    Choi, Chang
NEURAL COMPUTING & APPLICATIONS, 2023, 37 (12): 7721 - 7731
  • [8] Compact Models for Periocular Verification Through Knowledge Distillation
    Boutros, Fadi
    Damer, Naser
    Fang, Meiling
    Raja, Kiran
    Kirchbuchner, Florian
    Kuijper, Arjan
    2020 INTERNATIONAL CONFERENCE OF THE BIOMETRICS SPECIAL INTEREST GROUP (BIOSIG), 2020, P-306
  • [9] Block change learning for knowledge distillation
    Choi, Hyunguk
    Lee, Younkwan
    Yow, Kin Choong
    Jeon, Moongu
    INFORMATION SCIENCES, 2020, 513 (513) : 360 - 371
  • [10] Skill enhancement learning with knowledge distillation
    Liu, Naijun
    Sun, Fuchun
    Fang, Bin
    Liu, Huaping
    SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (08)