Model Compression for Deep Neural Networks: A Survey

Cited by: 68
Authors
Li, Zhuo [1 ]
Li, Hengyi [1 ]
Meng, Lin [2 ]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 5258577, Japan
[2] Ritsumeikan Univ, Coll Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 5258577, Japan
Keywords
deep neural networks; model compression; model pruning; parameter quantization; low-rank decomposition; knowledge distillation; lightweight model design
DOI
10.3390/computers12030060
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Currently, with the rapid development of deep learning, deep neural networks (DNNs) have been widely applied in various computer vision tasks. However, in the pursuit of performance, advanced DNN models have become increasingly complex, leading to large memory footprints and high computational demands that make real-time application difficult. To address these issues, model compression has become a focus of research; it also plays an important role in deploying models on edge devices. This study analyzes various model compression methods to help researchers reduce device storage requirements, speed up model inference, reduce model complexity and training costs, and improve model deployment. It summarizes the state-of-the-art techniques for model compression, including model pruning, parameter quantization, low-rank decomposition, knowledge distillation, and lightweight model design, and discusses research challenges and directions for future work.
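As a quick illustration of two of the techniques the abstract names, the following minimal sketch (hypothetical code, not taken from the survey) applies magnitude pruning and uniform symmetric 8-bit parameter quantization to a toy weight matrix:

```python
import numpy as np

# Toy weight matrix standing in for a layer's parameters (illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)

def prune_by_magnitude(w, sparsity=0.5):
    """Magnitude pruning: zero out the fraction `sparsity` of weights
    with the smallest absolute value."""
    threshold = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) >= threshold
    return w * mask, mask

def quantize_int8(w):
    """Uniform symmetric quantization: map float weights to int8 codes
    plus a single per-tensor scale factor."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

W_pruned, mask = prune_by_magnitude(W, sparsity=0.5)  # roughly half the weights zeroed
q, scale = quantize_int8(W_pruned)                    # 8-bit codes instead of float32
W_dequant = q.astype(np.float32) * scale              # approximate reconstruction

print("sparsity:", 1.0 - mask.mean())
print("max quantization error:", np.abs(W_dequant - W_pruned).max())
```

The pruned mask removes small-magnitude weights, and storing `q` (int8) plus `scale` instead of `W` (float32) cuts parameter storage by about 4x; real compression pipelines typically fine-tune the model after these steps to recover accuracy.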
Pages: 22
Related Papers
50 records
  • [21] Platform-specific Model Compression for Deep Neural Networks with Joint Methods
    Northeastern University
  • [22] Model Preserving Compression for Neural Networks
    Chee, Jerry
    Flynn, Megan
    Damle, Anil
    De Sa, Christopher
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [23] Survey of Deep Learning Model Compression and Acceleration
    Gao H.
    Tian Y.-L.
    Xu F.-Y.
    Zhong S.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (01): 68 - 92
  • [24] Multi-Resolution Model Compression for Deep Neural Networks: A Variational Bayesian Approach
    Xia, Chengyu
    Guo, Huayan
    Ma, Haoyu
    Tsang, Danny H. K.
    Lau, Vincent K. N.
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 1944 - 1959
  • [25] Deep Model Compression for Mobile Platforms: A Survey
    Nan, Kaiming
    Liu, Sicong
    Du, Junzhao
    Liu, Hui
    TSINGHUA SCIENCE AND TECHNOLOGY, 2019, 24 (06) : 677 - 693
  • [26] Model Compression Hardens Deep Neural Networks: A New Perspective to Prevent Adversarial Attacks
    Liu, Qi
    Wen, Wujie
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (01) : 3 - 14
  • [28] Deep Neural Networks and Tabular Data: A Survey
    Borisov, Vadim
    Leemann, Tobias
    Sessler, Kathrin
    Haug, Johannes
    Pawelczyk, Martin
    Kasneci, Gjergji
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (06) : 7499 - 7519
  • [29] A Survey of Accelerator Architectures for Deep Neural Networks
    Chen, Yiran
    Xie, Yuan
    Song, Linghao
    Chen, Fan
    Tang, Tianqi
    ENGINEERING, 2020, 6 (03) : 264 - 274
  • [30] Survey on Deep Convolutional Neural Networks in Mammography
    Abdelhafiz, Dina
    Nabavi, Sheida
    Ammar, Reda
    Yang, Clifford
    2017 IEEE 7TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL ADVANCES IN BIO AND MEDICAL SCIENCES (ICCABS), 2017,