HeadStart: Enforcing Optimal Inceptions in Pruning Deep Neural Networks for Efficient Inference on GPGPUs

被引:0
|
作者
Lin, Ning [1 ,2 ]
Lu, Hang [1 ,2 ]
Wei, Xin [1 ,2 ]
Li, Xiaowei [1 ,2 ]
机构
[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Beijing, Peoples R China
基金
中国国家自然科学基金;
关键词
D O I
10.1145/3316781.3317837
中图分类号
TP31 [计算机软件];
学科分类号
081202 ; 0835 ;
摘要
Deep convolutional neural networks are well-known for the extensive parameters and computation intensity. Structured pruning is an effective solution to obtain a more compact model for the efficient inference on GPGPUs, without designing specific hardware accelerators. However, previous works resort to certain metrics in channel/filter pruning and count on labor intensive fine-tunings to recover the accuracy loss. The "inception" of the pruned model, as another form factor, has indispensable impact to the final accuracy but its importance is often ignored in these works. In this paper, we prove that optimal inception will be more likely to induce a satisfied performance and shortened fine-tuning iterations. We also propose a reinforcement learning based solution, termed as HeadStart, seeking to learn the best way of pruning aiming at the optimal inception. With the help of the specialized head-start network, it could automatically balance the tradeoff between the final accuracy and the preset speedup rather than tilting to one of them, which makes it differentiated from existing works as well. Experimental results show that HeadStart could attain up to 2.25x inference speedup with only 1.16% accuracy loss tested with large scale images on various GPGPUs, and could be well generalized to various cutting-edge DCNN models.
引用
收藏
页数:6
相关论文
共 50 条
  • [1] Efficient Distributed Inference of Deep Neural Networks via Restructuring and Pruning
    Abdi, Afshin
    Rashidi, Saeed
    Fekri, Faramarz
    Krishna, Tushar
    [J]. THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 6, 2023, : 6640 - 6648
  • [2] Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks
    Hoefler, Torsten
    Alistarh, Dan
    Ben-Nun, Tal
    Dryden, Nikoli
    Peste, Alexandra
    [J]. Journal of Machine Learning Research, 2021, 22
  • [3] Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks
    Hoefler, Torsten
    Alistarh, Dan
    Ben-Nun, Tal
    Dryden, Nikoli
    Peste, Alexandra
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 23
  • [4] Redundant feature pruning for accelerated inference in deep neural networks
    Ayinde, Babajide O.
    Inanc, Tamer
    Zurada, Jacek M.
    [J]. NEURAL NETWORKS, 2019, 118 : 148 - 158
  • [5] Structured Term Pruning for Computational Efficient Neural Networks Inference
    Huang, Kai
    Li, Bowen
    Chen, Siang
    Claesen, Luc
    Xi, Wei
    Chen, Junjian
    Jiang, Xiaowen
    Liu, Zhili
    Xiong, Dongliang
    Yan, Xiaolang
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2023, 42 (01) : 190 - 203
  • [6] Pruning Deep Neural Networks by Optimal Brain Damage
    Liu, Chao
    Zhang, Zhiyong
    Wang, Dong
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1092 - 1095
  • [7] Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Dai, Wenrui
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. FIFTH WORKSHOP ON ENERGY EFFICIENT MACHINE LEARNING AND COGNITIVE COMPUTING - NEURIPS EDITION (EMC2-NIPS 2019), 2019, : 14 - 17
  • [8] Holistic Filter Pruning for Efficient Deep Neural Networks
    Enderich, Lukas
    Timm, Fabian
    Burgard, Wolfram
    [J]. 2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021, : 2595 - 2604
  • [9] Optimal pruning in neural networks
    Barbato, DML
    Kinouchi, O
    [J]. PHYSICAL REVIEW E, 2000, 62 (06): : 8387 - 8394
  • [10] TRP: Trained Rank Pruning for Efficient Deep Neural Networks
    Xu, Yuhui
    Li, Yuxi
    Zhang, Shuai
    Wen, Wei
    Wang, Botao
    Qi, Yingyong
    Chen, Yiran
    Lin, Weiyao
    Xiong, Hongkai
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 977 - 983