A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks

Cited by: 8
Authors
Ma, Yuzhe [1 ]
Chen, Ran [1 ]
Li, Wei [1 ]
Shang, Fanhua [2 ]
Yu, Wenjian [3 ]
Cho, Minsik [4 ]
Yu, Bei [1 ]
Affiliations
[1] Chinese Univ Hong Kong, CSE Dept, Hong Kong, Peoples R China
[2] Xidian Univ, Sch Artificial Intelligence, Xian, Peoples R China
[3] Tsinghua Univ, Dept Comp Sci & Tech, BNRist, Beijing, Peoples R China
[4] IBM TJ Watson, Yorktown Hts, NY USA
DOI
10.1109/ICTAI.2019.00060
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep neural networks (DNNs) have achieved significant success in a variety of real-world applications, e.g., image classification. However, the huge number of parameters in these networks limits their efficiency due to large model size and intensive computation. To address this issue, various approximation techniques have been investigated that seek a lightweight network with little performance degradation in exchange for a smaller model size or faster inference. Both low-rankness and sparsity are appealing properties for network approximation. In this paper we propose a unified framework to compress convolutional neural networks (CNNs) by combining these two properties, while taking the nonlinear activation into consideration. Each layer in the network is approximated by the sum of a structured sparse component and a low-rank component, which is formulated as an optimization problem. An extended version of the alternating direction method of multipliers (ADMM) with guaranteed convergence is then presented to solve the relaxed optimization problem. Experiments are carried out on VGG-16, AlexNet and GoogLeNet with large image classification datasets. The results outperform previous work in terms of accuracy degradation, compression rate and speedup ratio. The proposed method remarkably compresses the model (with up to 4.9x reduction in parameters) at the cost of little or no loss in accuracy.
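The layer-wise decomposition the abstract describes can be illustrated, in spirit, with a classic robust-PCA-style ADMM that splits a weight matrix W into a low-rank component L plus a sparse component S. This is only a sketch of the general idea: it solves min ||L||_* + lam*||S||_1 subject to W = L + S, and omits the structured-sparsity constraint and the nonlinear-activation handling of the paper's actual formulation. All function names and parameter choices here are illustrative, not the authors' implementation.

```python
import numpy as np

def svt(X, tau):
    # Singular value thresholding: proximal operator of the nuclear norm.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def soft(X, tau):
    # Elementwise soft-thresholding: proximal operator of the l1 norm.
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def lowrank_plus_sparse(W, lam=None, mu=1.0, iters=500):
    """ADMM sketch for  min ||L||_* + lam*||S||_1  s.t.  W = L + S."""
    if lam is None:
        lam = 1.0 / np.sqrt(max(W.shape))   # common default from robust PCA
    L = np.zeros_like(W)
    S = np.zeros_like(W)
    Y = np.zeros_like(W)                    # scaled dual variable
    for _ in range(iters):
        L = svt(W - S + Y / mu, 1.0 / mu)   # low-rank update
        S = soft(W - L + Y / mu, lam / mu)  # sparse update
        Y = Y + mu * (W - L - S)            # dual ascent on the constraint
    return L, S
```

After convergence, L can be stored as two thin factors from its SVD and S in a sparse format, which is where the parameter reduction comes from.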
Pages: 376–383 (8 pages)
Related Papers (50 total; items 41–50 shown)
  • [41] He, Yihui; Zhang, Xiangyu; Sun, Jian. Channel Pruning for Accelerating Very Deep Neural Networks. 2017 IEEE International Conference on Computer Vision (ICCV), 2017: 1398–1406.
  • [42] Xu, Jie; Wang, Jingyu; Qi, Qi; Sun, Haifeng; Liao, Jianxin. Accelerating Training for Distributed Deep Neural Networks in MapReduce. Web Services – ICWS 2018, 2018, 10966: 181–195.
  • [43] Zhou, Yao; Yen, Gary G.; Yi, Zhang. A Knee-Guided Evolutionary Algorithm for Compressing Deep Neural Networks. IEEE Transactions on Cybernetics, 2021, 51(3): 1626–1638.
  • [44] He, Huarui; Wang, Jie; Zhang, Zhanqiu; Wu, Feng. Compressing Deep Graph Neural Networks via Adversarial Knowledge Distillation. Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2022), 2022: 534–544.
  • [45] Zhao, Xiaoyu; Gong, Zhiqiang; Chen, Xiaoqian; Yao, Wen; Zhang, Yunyang. A Unified Framework of Deep Neural Networks and Gappy Proper Orthogonal Decomposition for Global Field Reconstruction. 2023 International Joint Conference on Neural Networks (IJCNN), 2023.
  • [46] Huang, Yi; Duan, Xiusheng; Sun, Shiyu; Chen, Zhigang. A Study on Deep Neural Networks Framework. Proceedings of 2016 IEEE Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC 2016), 2016: 1519–1522.
  • [47] Zeng, Dan; Zhao, Fan; Shen, Wei; Ge, Shiming. Compressing and Accelerating Neural Network for Facial Point Localization. Cognitive Computation, 2018, 10: 359–367.
  • [48] De Ryck, T.; Mishra, S.; Ray, D. On the approximation of rough functions with deep neural networks. SeMA Journal, 2022, 79(3): 399–440.
  • [49] Petrova, Guergana; Wojtaszczyk, Przemyslaw. Limitations on approximation by deep and shallow neural networks. Journal of Machine Learning Research, 2023, 24.
  • [50] Shaham, Uri; Cloninger, Alexander; Coifman, Ronald R. Provable approximation properties for deep neural networks. Applied and Computational Harmonic Analysis, 2018, 44(3): 537–557.