A comprehensive survey on model compression and acceleration

Cited by: 279
Authors
Choudhary, Tejalal [1]
Mishra, Vipul [1]
Goswami, Anurag [1]
Sarangapani, Jagannathan [2]
Affiliations
[1] Bennett Univ, Greater Noida, India
[2] Missouri Univ Sci & Technol, Rolla, MO 65409 USA
Keywords
Model compression and acceleration; Machine learning; Deep learning; CNN; RNN; Resource-constrained devices; Efficient neural networks; NEURAL-NETWORK; PROXIMAL NEWTON; QUANTIZATION; CLASSIFICATION; ALGORITHM;
DOI
10.1007/s10462-020-09816-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. The trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB in size. Resource-constrained devices such as mobile phones and Internet of Things devices have limited memory and less computation power. For real-time applications, the trained models should be deployed on resource-constrained devices. Popular convolutional neural network models have millions of parameters, which increases the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, while compromising model accuracy as little as possible. It is a challenging task to retain the same accuracy after compressing the model. To address this challenge, in the last couple of years many researchers have suggested different techniques for model compression and acceleration. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
Pages: 5113-5155
Number of pages: 43