A comprehensive survey on model compression and acceleration

Cited by: 279
Authors
Choudhary, Tejalal [1]
Mishra, Vipul [1]
Goswami, Anurag [1]
Sarangapani, Jagannathan [2]
Affiliations
[1] Bennett Univ, Greater Noida, India
[2] Missouri Univ Sci & Technol, Rolla, MO 65409 USA
Keywords
Model compression and acceleration; Machine learning; Deep learning; CNN; RNN; Resource-constrained devices; Efficient neural networks; NEURAL-NETWORK; PROXIMAL NEWTON; QUANTIZATION; CLASSIFICATION; ALGORITHM
DOI
10.1007/s10462-020-09816-7
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In recent years, machine learning (ML) and deep learning (DL) have shown remarkable improvement in computer vision, natural language processing, stock prediction, forecasting, and audio processing, to name a few. Trained DL models for these complex tasks are large, which makes them difficult to deploy on resource-constrained devices. For instance, the pre-trained VGG16 model trained on the ImageNet dataset is more than 500 MB in size. Resource-constrained devices such as mobile phones and Internet of Things devices have limited memory and computation power. For real-time applications, the trained models should be deployed on such devices. Popular convolutional neural network models have millions of parameters, which increases the size of the trained model. Hence, it becomes essential to compress and accelerate these models before deploying them on resource-constrained devices, while compromising model accuracy as little as possible. Retaining the same accuracy after compressing the model is a challenging task. To address this challenge, many researchers have in the last couple of years suggested different techniques for model compression and acceleration. In this paper, we present a survey of various techniques suggested for compressing and accelerating ML and DL models. We also discuss the challenges of the existing techniques and provide future research directions in the field.
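As a rough illustration of the VGG16 figure quoted in the abstract, the sketch below counts the parameters of the standard 16-layer VGG configuration and converts the count to a storage size. It is not taken from the paper: the layer configuration, the 224x224 input (giving a 512x7x7 feature map before the fully connected layers), the 1000 ImageNet classes, and 32-bit float storage are assumptions, and the helper names `count_vgg16_params` and `size_mb` are hypothetical. The 8-bit figure only shows what a naive change of bit width would do to storage; it says nothing about accuracy.

```python
# Assumed standard VGG16 layout: integers are 3x3 conv output channels,
# "M" marks a max-pooling layer (which has no parameters).
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]


def count_vgg16_params(num_classes=1000):
    """Count weights and biases, assuming a 3-channel 224x224 input."""
    total = 0
    in_channels = 3
    for v in VGG16_CFG:
        if v == "M":
            continue
        # 3x3 kernel weights plus one bias per output channel
        total += 3 * 3 * in_channels * v + v
        in_channels = v
    # Three fully connected layers on the flattened 512x7x7 feature map
    fc_dims = [512 * 7 * 7, 4096, 4096, num_classes]
    for fan_in, fan_out in zip(fc_dims, fc_dims[1:]):
        total += fan_in * fan_out + fan_out
    return total


def size_mb(num_params, bits_per_param):
    """Raw parameter storage in megabytes (10^6 bytes), ignoring file overhead."""
    return num_params * bits_per_param / 8 / 1e6


n_params = count_vgg16_params()     # 138,357,544 parameters
fp32_mb = size_mb(n_params, 32)     # about 553 MB, consistent with "more than 500 MB"
int8_mb = size_mb(n_params, 8)      # about 138 MB if every parameter used 8 bits
```

Most of the count comes from the first fully connected layer (25088 x 4096 weights), which is why techniques that target dense layers can shrink such a model substantially.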
Pages: 5113 - 5155
Number of pages: 43