A comprehensive survey on model compression and acceleration

Cited by: 279
Authors
Choudhary, Tejalal [1 ]
Mishra, Vipul [1 ]
Goswami, Anurag [1 ]
Sarangapani, Jagannathan [2 ]
Affiliations
[1] Bennett Univ, Greater Noida, India
[2] Missouri Univ Sci & Technol, Rolla, MO 65409 USA
Keywords
Model compression and acceleration; Machine learning; Deep learning; CNN; RNN; Resource-constrained devices; Efficient neural networks; NEURAL-NETWORK; PROXIMAL NEWTON; QUANTIZATION; CLASSIFICATION; ALGORITHM
DOI
10.1007/s10462-020-09816-7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvements in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large; for instance, the VGG16 model pre-trained on the ImageNet dataset is more than 500 MB. Resource-constrained devices such as mobile phones and Internet of Things (IoT) devices have limited memory and computation power, yet real-time applications require the trained models to be deployed on exactly such devices. Popular convolutional neural network models have millions of parameters, which inflates the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, while compromising model accuracy as little as possible. Retaining the original accuracy after compression is a challenging task. To address this challenge, many researchers have proposed a variety of model compression and acceleration techniques over the past few years. In this paper, we present a survey of the techniques proposed for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
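To make the abstract's size figure concrete, the following is a minimal Python sketch (an illustration, not code from the surveyed paper; it assumes PyTorch and torchvision are installed). It counts VGG16's parameters, whose float32 storage alone accounts for the 500+ MB quoted above, and then applies post-training dynamic quantization, one of the compression techniques covered by surveys in this area, to store the weights of the fully connected layers as int8.

    import os
    import tempfile

    import torch
    import torchvision.models as models

    # Build the VGG16 architecture only (weights=None avoids downloading
    # the pre-trained ImageNet checkpoint; the parameter count is the
    # same either way).
    model = models.vgg16(weights=None)

    # VGG16 has roughly 138 million parameters; at 4 bytes each (float32)
    # that is over 500 MB, matching the figure quoted in the abstract.
    n_params = sum(p.numel() for p in model.parameters())
    print(f"parameters: {n_params / 1e6:.1f}M, "
          f"fp32 size: {n_params * 4 / 2**20:.0f} MB")

    # Post-training dynamic quantization: store the weights of Linear
    # layers as int8. VGG16's fully connected layers hold most of its
    # parameters, so this alone removes a large share of the model size.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    def saved_size_mb(m):
        # Serialize a model to a temporary file and report its size in MB.
        path = os.path.join(tempfile.gettempdir(), "model_size_check.pt")
        torch.save(m.state_dict(), path)
        size = os.path.getsize(path) / 2**20
        os.unlink(path)
        return size

    print(f"fp32 on disk: {saved_size_mb(model):.0f} MB")
    print(f"int8-dynamic on disk: {saved_size_mb(quantized):.0f} MB")

Dynamic quantization trades a small accuracy drop for roughly a 4x storage reduction in the quantized layers; pruning, knowledge distillation, and low-rank factorization, also discussed in the survey, pursue the same trade-off by different means.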
Pages: 5113-5155
Page count: 43
Related Papers
(50 items in total; items [21]-[30] shown)
  • [21] A comprehensive review of model compression techniques in machine learning
    Dantas, Pierre Vilar
    da Silva Jr, Waldir Sabino
    Cordeiro, Lucas Carvalho
    Carvalho, Celso Barbosa
    APPLIED INTELLIGENCE, 2024, 54 (22) : 11804 - 11844
  • [22] Trust Management Model in IoT: A Comprehensive Survey
    Saeed, Muhammad
    Aftab, Muhammad
    Amin, Rashid
    Koundal, Deepika
    INNOVATIONS IN BIO-INSPIRED COMPUTING AND APPLICATIONS, IBICA 2021, 2022, 419 : 675 - 684
  • [23] Model Compression and Acceleration for Deep Neural Networks: The principles, progress, and challenges
    Cheng, Yu
    Wang, Duo
    Zhou, Pan
    Zhang, Tao
    IEEE SIGNAL PROCESSING MAGAZINE, 2018, 35 (01) : 126 - 136
  • [24] Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study
    Yu, Muzhou
    Zhang, Linfeng
    Ma, Kaisheng
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [25] Compression acceleration using GPGPU
    Shastry, Krishnaprasad
    Pandey, Avinash
    Agrawal, Ashutosh
    Sarveswara, Ravi
    2016 23RD IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING WORKSHOPS (HIPCW 2016), 2016, : 70 - 78
  • [26] RamanCMP: A Raman spectral classification acceleration method based on lightweight model and model compression techniques
    Gong, Zengyun
    Chen, Chen
    Chen, Cheng
    Li, Chenxi
    Tian, Xuecong
    Gong, Zhongcheng
    Lv, Xiaoyi
    ANALYTICA CHIMICA ACTA, 2023, 1278
  • [27] Model aggregation techniques in federated learning: A comprehensive survey
    Qi, Pian
    Chiaro, Diletta
    Guzzo, Antonella
    Ianni, Michele
    Fortino, Giancarlo
    Piccialli, Francesco
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 150 : 272 - 293
  • [28] Business Process Model Transformation Techniques: A Comprehensive Survey
    Khudori, Ahsanun Naseh
    Kurniawan, Tri Astoto
    ADVANCED SCIENCE LETTERS, 2018, 24 (11) : 8606 - 8612
  • [29] Trust Evaluation Model in IoT Environment: A Comprehensive Survey
    Alhandi, Somya Abdulkarim
    Kamaludin, Hazalila
    Alduais, Nayef Abdulwahab Mohammed
    IEEE ACCESS, 2023, 11 : 11165 - 11182
  • [30] A Comprehensive Survey of Datasets for Large Language Model Evaluation
    Lu, Yuting
    Sun, Chao
    Yan, Yuchao
    Zhu, Hegong
    Song, Dongdong
    Peng, Qing
    Yu, Li
    Wang, Xiaozheng
    Jiang, Jian
    Ye, Xiaolong
    2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024, : 330 - 336