A comprehensive survey on model compression and acceleration

Cited by: 279
Authors
Choudhary, Tejalal [1 ]
Mishra, Vipul [1 ]
Goswami, Anurag [1 ]
Sarangapani, Jagannathan [2 ]
Affiliations
[1] Bennett Univ, Greater Noida, India
[2] Missouri Univ Sci & Technol, Rolla, MO 65409 USA
Keywords
Model compression and acceleration; Machine learning; Deep learning; CNN; RNN; Resource-constrained devices; Efficient neural networks; NEURAL-NETWORK; PROXIMAL NEWTON; QUANTIZATION; CLASSIFICATION; ALGORITHM
DOI
10.1007/s10462-020-09816-7
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvements in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large; for instance, the VGG16 model pre-trained on the ImageNet dataset is more than 500 MB. Resource-constrained devices such as mobile phones and Internet of Things (IoT) devices have limited memory and computation power, yet real-time applications require the trained models to be deployed on exactly such devices. Popular convolutional neural network models have millions of parameters, which inflates the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, while compromising model accuracy as little as possible. Retaining the original accuracy after compression is a challenging task. To address this challenge, many researchers have proposed a variety of model compression and acceleration techniques over the past few years. In this paper, we present a survey of the techniques proposed for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
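To make the abstract's size figure concrete, the following is a minimal Python sketch (an illustration, not code from the surveyed paper; it assumes PyTorch and torchvision are installed). It counts VGG16's parameters, whose float32 storage alone accounts for the 500+ MB quoted above, and then applies post-training dynamic quantization, one of the compression techniques covered by surveys in this area, to store the weights of the fully connected layers as int8.

    import os
    import tempfile

    import torch
    import torchvision.models as models

    # Build the VGG16 architecture only (weights=None avoids downloading
    # the pre-trained ImageNet checkpoint; the parameter count is the
    # same either way).
    model = models.vgg16(weights=None)

    # VGG16 has roughly 138 million parameters; at 4 bytes each (float32)
    # that is over 500 MB, matching the figure quoted in the abstract.
    n_params = sum(p.numel() for p in model.parameters())
    print(f"parameters: {n_params / 1e6:.1f}M, "
          f"fp32 size: {n_params * 4 / 2**20:.0f} MB")

    # Post-training dynamic quantization: store the weights of Linear
    # layers as int8. VGG16's fully connected layers hold most of its
    # parameters, so this alone removes a large share of the model size.
    quantized = torch.ao.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    def saved_size_mb(m):
        # Serialize a model to a temporary file and report its size in MB.
        path = os.path.join(tempfile.gettempdir(), "model_size_check.pt")
        torch.save(m.state_dict(), path)
        size = os.path.getsize(path) / 2**20
        os.unlink(path)
        return size

    print(f"fp32 on disk: {saved_size_mb(model):.0f} MB")
    print(f"int8-dynamic on disk: {saved_size_mb(quantized):.0f} MB")

Dynamic quantization trades a small accuracy drop for roughly a 4x storage reduction in the quantized layers; pruning, knowledge distillation, and low-rank factorization, also discussed in the survey, pursue the same trade-off by different means.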
Pages: 5113-5155
Page count: 43
Related Papers
(50 items in total; items [21]-[30] shown)
  • [21] A comprehensive review of model compression techniques in machine learning
    Dantas, Pierre Vilar
    da Silva Jr, Waldir Sabino
    Cordeiro, Lucas Carvalho
    Carvalho, Celso Barbosa
    APPLIED INTELLIGENCE, 2024, 54 (22) : 11804 - 11844
  • [22] Trust Management Model in IoT: A Comprehensive Survey
    Saeed, Muhammad
    Aftab, Muhammad
    Amin, Rashid
    Koundal, Deepika
    INNOVATIONS IN BIO-INSPIRED COMPUTING AND APPLICATIONS, IBICA 2021, 2022, 419 : 675 - 684
  • [23] Model Compression and Acceleration for Deep Neural Networks: The principles, progress, and challenges
    Cheng, Yu
    Wang, Duo
    Zhou, Pan
    Zhang, Tao
    IEEE SIGNAL PROCESSING MAGAZINE, 2018, 35 (01) : 126 - 136
  • [24] Revisiting Data Augmentation in Model Compression: An Empirical and Comprehensive Study
    Yu, Muzhou
    Zhang, Linfeng
    Ma, Kaisheng
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [25] Compression acceleration using GPGPU
    Shastry, Krishnaprasad
    Pandey, Avinash
    Agrawal, Ashutosh
    Sarveswara, Ravi
    2016 23RD IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING WORKSHOPS (HIPCW 2016), 2016, : 70 - 78
  • [26] RamanCMP: A Raman spectral classification acceleration method based on lightweight model and model compression techniques
    Gong, Zengyun
    Chen, Chen
    Chen, Cheng
    Li, Chenxi
    Tian, Xuecong
    Gong, Zhongcheng
    Lv, Xiaoyi
    ANALYTICA CHIMICA ACTA, 2023, 1278
  • [27] Model aggregation techniques in federated learning: A comprehensive survey
    Qi, Pian
    Chiaro, Diletta
    Guzzo, Antonella
    Ianni, Michele
    Fortino, Giancarlo
    Piccialli, Francesco
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2024, 150 : 272 - 293
  • [28] Business Process Model Transformation Techniques: A Comprehensive Survey
    Khudori, Ahsanun Naseh
    Kurniawan, Tri Astoto
    ADVANCED SCIENCE LETTERS, 2018, 24 (11) : 8606 - 8612
  • [29] Trust Evaluation Model in IoT Environment: A Comprehensive Survey
    Alhandi, Somya Abdulkarim
    Kamaludin, Hazalila
    Alduais, Nayef Abdulwahab Mohammed
    IEEE ACCESS, 2023, 11 : 11165 - 11182
  • [30] A Comprehensive Survey of Datasets for Large Language Model Evaluation
    Lu, Yuting
    Sun, Chao
    Yan, Yuchao
    Zhu, Hegong
    Song, Dongdong
    Peng, Qing
    Yu, Li
    Wang, Xiaozheng
    Jiang, Jian
    Ye, Xiaolong
    2024 5TH INFORMATION COMMUNICATION TECHNOLOGIES CONFERENCE, ICTC 2024, 2024, : 330 - 336