A comprehensive survey on model compression and acceleration

Cited by: 279
Authors
Choudhary, Tejalal [1]
Mishra, Vipul [1]
Goswami, Anurag [1]
Sarangapani, Jagannathan [2]
Affiliations
[1] Bennett Univ, Greater Noida, India
[2] Missouri Univ Sci & Technol, Rolla, MO 65409 USA
Keywords
Model compression and acceleration; Machine learning; Deep learning; CNN; RNN; Resource-constrained devices; Efficient neural networks; NEURAL-NETWORK; PROXIMAL NEWTON; QUANTIZATION; CLASSIFICATION; ALGORITHM;
DOI
10.1007/s10462-020-09816-7
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few areas. The trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB in size. Resource-constrained devices such as mobile phones and Internet of Things devices have limited memory and less computation power. For real-time applications, the trained models should be deployed on resource-constrained devices. Popular convolutional neural network models have millions of parameters, which increases the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, while compromising model accuracy as little as possible. Retaining the same accuracy after compressing the model is a challenging task. To address this challenge, many researchers have suggested different techniques for model compression and acceleration in the last couple of years. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
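The link the abstract draws between parameter count and model size can be checked with a rough calculation. The sketch below is only illustrative: the VGG16 parameter count (138,357,544) is the commonly reported figure and does not appear in this record, and the helper `model_size_mb` is a hypothetical name. It counts raw weight storage only and ignores file-format overhead.

```python
# Commonly reported parameter count for VGG16 with the 1000-class ImageNet head.
# Assumption: this figure is not given in the abstract, which states only ">500 MB".
VGG16_PARAMS = 138_357_544


def model_size_mb(num_params: int, bits_per_param: int) -> float:
    """Raw weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_param / 8 / 1e6


# 32-bit floating-point weights, the usual format of a trained model.
fp32_mb = model_size_mb(VGG16_PARAMS, 32)
# 8-bit weights, as one would get after quantizing every parameter to 8 bits.
int8_mb = model_size_mb(VGG16_PARAMS, 8)

print(f"float32 weights: {fp32_mb:.1f} MB")
print(f"8-bit weights:   {int8_mb:.1f} MB")
```

Under these assumptions the 32-bit weights alone exceed 500 MB, which agrees with the figure quoted in the abstract, and storing each weight in 8 bits would cut the raw storage by a factor of four before any accuracy considerations.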
Pages: 5113-5155
Number of pages: 43
Related papers
50 in total
  • [1] A comprehensive survey on model compression and acceleration
    Tejalal Choudhary
    Vipul Mishra
    Anurag Goswami
    Jagannathan Sarangapani
    Artificial Intelligence Review, 2020, 53 : 5113 - 5155
  • [2] Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey
    Deng, Lei
    Li, Guoqi
    Han, Song
    Shi, Luping
    Xie, Yuan
    PROCEEDINGS OF THE IEEE, 2020, 108 (04) : 485 - 532
  • [3] Survey of Deep Learning Model Compression and Acceleration
    Gao H.
    Tian Y.-L.
    Xu F.-Y.
    Zhong S.
    Ruan Jian Xue Bao/Journal of Software, 2021, 32 (01) : 68 - 92
  • [4] A Survey on Model Compression and Acceleration for Pretrained Language Models
    Xu, Canwen
    McAuley, Julian
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 9, 2023 : 10566 - 10575
  • [5] A survey on GAN acceleration using memory compression techniques
    Tantawy D.
    Zahran M.
    Wassal A.
    Journal of Engineering and Applied Science, 2021, 68 (01)
  • [6] SlimNets: An Exploration of Deep Model Compression and Acceleration
    Oguntola, Ini
    Olubeko, Subby
    Sweeney, Christopher
    2018 IEEE HIGH PERFORMANCE EXTREME COMPUTING CONFERENCE (HPEC), 2018
  • [7] Hardware-friendly compression and hardware acceleration for transformer: A survey
    Huang, Shizhen
    Tang, Enhao
    Li, Shun
    Ping, Xiangzhan
    Chen, Ruiqi
    ELECTRONIC RESEARCH ARCHIVE, 2022, 30 (10) : 3755 - 3785
  • [8] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
    He, Yihui
    Lin, Ji
    Liu, Zhijian
    Wang, Hanrui
    Li, Li-Jia
    Han, Song
    COMPUTER VISION - ECCV 2018, PT VII, 2018, 11211 : 815 - 832
  • [9] A Comprehensive Survey on Training Acceleration for Large Machine Learning Models in IoT
    Wang, Haozhao
    Qu, Zhihao
    Zhou, Qihua
    Zhang, Haobo
    Luo, Boyuan
    Xu, Wenchao
    Guo, Song
    Li, Ruixuan
    IEEE INTERNET OF THINGS JOURNAL, 2022, 9 (02) : 939 - 963
  • [10] Learning-driven lossy image compression: A comprehensive survey
    Jamil, Sonain
    Piran, Md. Jalil
    Rahman, MuhibUr
    Kwon, Oh-Jin
    Engineering Applications of Artificial Intelligence, 2023, 123