Deep neural networks compression: A comparative survey and choice recommendations

Cited by: 25
Authors
Marino, Giosue Cataldo [1 ]
Petrini, Alessandro [1 ]
Malchiodi, Dario [1 ]
Frasca, Marco [1 ]
Affiliations
[1] Univ Milan, Dipartimento Informat, Via Celoria 18, I-20133 Milan, Italy
Keywords
CNN compression; Connection pruning; Weight quantization; Weight sharing; Huffman coding; Succinct Deep Neural Networks; Weights
DOI: 10.1016/j.neucom.2022.11.072
Chinese Library Classification (CLC): TP18 [Artificial intelligence theory]
Discipline classification codes: 081104; 0812; 0835; 1405
Abstract
State-of-the-art performance on many real-world problems is currently achieved by deep and, in particular, convolutional neural networks (CNNs). These models exploit recent advances in deep learning, yielding highly accurate yet very large networks with typically millions to billions of parameters. As a result, such models are often redundant and oversized, which harms the environment through unnecessary energy consumption and limits their deployment on low-resource devices. The research community therefore increasingly needs compression techniques that reduce both the number of model parameters and their resource demand. In this paper we present what is, to the best of our knowledge, the first extensive comparison of the main lossy and structure-preserving approaches for compressing pre-trained CNNs, applicable in principle to any existing model. Our study is intended to provide preliminary guidance for choosing the most suitable compression technique when the memory footprint of a pre-trained model must be reduced. Both convolutional and fully-connected layers are included in the analysis. Our experiments involved two pre-trained state-of-the-art CNNs (addressing classification and regression problems, respectively) and five benchmarks, and yielded important insights into the applicability and performance of these techniques with respect to the type of layer being compressed and the category of problem tackled. (c) 2022 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
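The keywords above point at the standard lossy, structure-preserving compression pipeline: connection pruning, weight quantization/sharing, and Huffman coding. As a minimal illustrative sketch, not taken from the paper, the following NumPy code shows two of these steps on a toy weight matrix: magnitude-based connection pruning followed by uniform weight quantization into a small shared codebook. The function names, sparsity level, and bit width are all hypothetical choices for illustration.

# Illustrative sketch (NOT the paper's code): magnitude pruning +
# uniform weight quantization, two lossy structure-preserving steps
# surveyed in the article. All names and thresholds are assumptions.
import numpy as np

def prune_by_magnitude(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = weights.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned

def quantize_uniform(weights: np.ndarray, n_bits: int = 8):
    """Map weights to 2**n_bits shared values (weight sharing via binning)."""
    levels = 2 ** n_bits
    w_min, w_max = weights.min(), weights.max()
    scale = (w_max - w_min) / (levels - 1) if w_max > w_min else 1.0
    codes = np.round((weights - w_min) / scale).astype(np.int32)
    return codes, w_min, scale  # store small integer codes + codebook params

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(size=(64, 64)).astype(np.float32)  # toy dense layer
    w_pruned = prune_by_magnitude(w, sparsity=0.9)
    codes, w_min, scale = quantize_uniform(w_pruned, n_bits=8)
    w_restored = w_min + codes * scale                # dequantized weights
    print("nonzero fraction:", np.count_nonzero(w_pruned) / w.size)
    print("max quantization error:", np.abs(w_pruned - w_restored).max())

In a full pipeline of this kind, the integer codes would then typically be entropy-coded (e.g., with Huffman coding) to further reduce storage, which is the role the Huffman-coding keyword refers to.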
Pages: 152-170 (19 pages)