HeadStart: Enforcing Optimal Inceptions in Pruning Deep Neural Networks for Efficient Inference on GPGPUs

Cited by: 0

Authors:

Lin, Ning [1,2]

Lu, Hang [1,2]

Wei, Xin [1,2]

Li, Xiaowei [1,2]

Affiliations:

[1] Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing, Peoples R China

[2] Univ Chinese Acad Sci, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 2019 56TH ACM/EDAC/IEEE DESIGN AUTOMATION CONFERENCE (DAC) | 2019

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1145/3316781.3317837

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification codes:

081202; 0835

Abstract:

Deep convolutional neural networks (DCNNs) are well known for their large parameter counts and computational intensity. Structured pruning is an effective way to obtain a more compact model for efficient inference on GPGPUs without designing dedicated hardware accelerators. However, previous works rely on specific metrics for channel/filter pruning and count on labor-intensive fine-tuning to recover the accuracy loss. The "inception" of the pruned model, as another form factor, has an indispensable impact on the final accuracy, but its importance is often ignored in these works. In this paper, we prove that an optimal inception is more likely to yield satisfactory performance and fewer fine-tuning iterations. We also propose a reinforcement-learning-based solution, termed HeadStart, that seeks to learn the best way of pruning toward the optimal inception. With the help of a specialized head-start network, it automatically balances the tradeoff between the final accuracy and the preset speedup rather than tilting toward one of them, which also differentiates it from existing works. Experimental results show that HeadStart attains up to 2.25x inference speedup with only 1.16% accuracy loss when tested with large-scale images on various GPGPUs, and that it generalizes well to various cutting-edge DCNN models.
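The tradeoff the abstract describes, between final accuracy and a preset speedup under filter pruning, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the two-layer toy network, and especially the `reward` formula are hypothetical and are not taken from the HeadStart paper. The speedup is a theoretical ratio of multiply-accumulate counts, which may differ from the speedup measured on an actual GPGPU.

```python
# Illustrative sketch only. The reward below is an ASSUMPTION, not the
# formulation used in the HeadStart paper.


def conv_macs(c_in, c_out, k, h_out, w_out):
    """Multiply-accumulate count of one k x k convolution layer."""
    return c_in * c_out * k * k * h_out * w_out


def network_macs(layers, kept):
    """Total MACs of a plain chain of conv layers when layer i keeps
    kept[i] output filters.

    Removing filters from layer i also shrinks the input channels of
    layer i + 1, which is why structured pruning lowers dense compute.
    """
    total = 0
    c_in = layers[0]["c_in"]
    for layer, c_out in zip(layers, kept):
        total += conv_macs(c_in, c_out, layer["k"], layer["h"], layer["w"])
        c_in = c_out
    return total


def theoretical_speedup(layers, kept):
    """Ratio of unpruned MACs to pruned MACs (not a measured latency ratio)."""
    full = [layer["c_out"] for layer in layers]
    return network_macs(layers, full) / network_macs(layers, kept)


def reward(accuracy, speedup, target_speedup, weight=1.0):
    """Hypothetical reward: accuracy minus a penalty for deviating from
    the preset speedup in either direction."""
    return accuracy - weight * abs(speedup - target_speedup)


# Toy two-layer network (made-up shapes), keeping half the filters per layer.
layers = [
    {"c_in": 3, "c_out": 64, "k": 3, "h": 32, "w": 32},
    {"c_in": 64, "c_out": 128, "k": 3, "h": 16, "w": 16},
]
kept = [32, 64]
s = theoretical_speedup(layers, kept)
print(f"speedup={s:.2f}, reward={reward(0.90, s, target_speedup=2.0):.3f}")
```

Penalizing the absolute deviation from the target, rather than rewarding speedup outright, is one simple way to keep a search from tilting entirely toward either accuracy or speed.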


Pages: 6
