Model Compression for Deep Neural Networks: A Survey

Cited by: 68
Authors
Li, Zhuo [1 ]
Li, Hengyi [1 ]
Meng, Lin [2 ]
Affiliations
[1] Ritsumeikan Univ, Grad Sch Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 5258577, Japan
[2] Ritsumeikan Univ, Coll Sci & Engn, 1-1-1 Noji Higashi, Kusatsu 5258577, Japan
Keywords
deep neural networks; model compression; model pruning; parameter quantization; low-rank decomposition; knowledge distillation; lightweight model design
DOI
10.3390/computers12030060
Chinese Library Classification (CLC)
TP39 [Computer Applications]
Subject Classification Codes
081203; 0835
Abstract
Currently, with the rapid development of deep learning, deep neural networks (DNNs) have been widely applied in various computer vision tasks. However, in the pursuit of performance, advanced DNN models have become increasingly complex, leading to large memory footprints and high computational demands that make real-time application difficult. To address these issues, model compression has become a focus of research; it also plays an important role in deploying models on edge devices. This study analyzes various model compression methods to help researchers reduce device storage requirements, speed up model inference, reduce model complexity and training costs, and improve model deployment. It summarizes the state-of-the-art techniques for model compression, including model pruning, parameter quantization, low-rank decomposition, knowledge distillation, and lightweight model design, and discusses research challenges and directions for future work.
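As a quick illustration of two of the techniques the abstract names, the following minimal sketch (hypothetical code, not taken from the survey) applies magnitude pruning and uniform symmetric 8-bit parameter quantization to a toy weight matrix:

```python
import numpy as np

# Toy weight matrix standing in for a layer's parameters (illustration only).
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4)).astype(np.float32)

def prune_by_magnitude(w, sparsity=0.5):
    """Magnitude pruning: zero out the fraction `sparsity` of weights
    with the smallest absolute value."""
    threshold = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) >= threshold
    return w * mask, mask

def quantize_int8(w):
    """Uniform symmetric quantization: map float weights to int8 codes
    plus a single per-tensor scale factor."""
    scale = np.max(np.abs(w)) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

W_pruned, mask = prune_by_magnitude(W, sparsity=0.5)  # roughly half the weights zeroed
q, scale = quantize_int8(W_pruned)                    # 8-bit codes instead of float32
W_dequant = q.astype(np.float32) * scale              # approximate reconstruction

print("sparsity:", 1.0 - mask.mean())
print("max quantization error:", np.abs(W_dequant - W_pruned).max())
```

The pruned mask removes small-magnitude weights, and storing `q` (int8) plus `scale` instead of `W` (float32) cuts parameter storage by about 4x; real compression pipelines typically fine-tune the model after these steps to recover accuracy.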
Pages: 22
Related Papers
50 records
  • [21] Platform-specific Model Compression for Deep Neural Networks with Joint Methods
    Northeastern University
  • [22] Model Preserving Compression for Neural Networks
    Chee, Jerry
    Flynn, Megan
    Damle, Anil
    De Sa, Christopher
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [23] Survey of Deep Learning Model Compression and Acceleration
    Gao H.
    Tian Y.-L.
    Xu F.-Y.
    Zhong S.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (01): 68 - 92
  • [24] Multi-Resolution Model Compression for Deep Neural Networks: A Variational Bayesian Approach
    Xia, Chengyu
    Guo, Huayan
    Ma, Haoyu
    Tsang, Danny H. K.
    Lau, Vincent K. N.
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 1944 - 1959
  • [25] Deep Model Compression for Mobile Platforms: A Survey
    Nan, Kaiming
    Liu, Sicong
    Du, Junzhao
    Liu, Hui
    TSINGHUA SCIENCE AND TECHNOLOGY, 2019, 24 (06) : 677 - 693
  • [26] Model Compression Hardens Deep Neural Networks: A New Perspective to Prevent Adversarial Attacks
    Liu, Qi
    Wen, Wujie
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 34 (01) : 3 - 14
  • [28] Deep Neural Networks and Tabular Data: A Survey
    Borisov, Vadim
    Leemann, Tobias
    Sessler, Kathrin
    Haug, Johannes
    Pawelczyk, Martin
    Kasneci, Gjergji
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (06) : 7499 - 7519
  • [29] A Survey of Accelerator Architectures for Deep Neural Networks
    Chen, Yiran
    Xie, Yuan
    Song, Linghao
    Chen, Fan
    Tang, Tianqi
    ENGINEERING, 2020, 6 (03) : 264 - 274
  • [30] Survey on Deep Convolutional Neural Networks in Mammography
    Abdelhafiz, Dina
    Nabavi, Sheida
    Ammar, Reda
    Yang, Clifford
    2017 IEEE 7TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL ADVANCES IN BIO AND MEDICAL SCIENCES (ICCABS), 2017,