Model Compression Based on Differentiable Network Channel Pruning

Cited by: 27
Authors
Zheng, Yu-Jie [1 ,2 ]
Chen, Si-Bao [1 ,2 ]
Ding, Chris H. Q. [3 ]
Luo, Bin [1 ,2 ]
Affiliations
[1] Anhui Univ, IMIS Lab Anhui Prov, Anhui Prov Key Lab Multimodal Cognit Computat, MOE Key Lab Intelligent Comp & Signal Proc ICSP, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[3] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Funding
National Natural Science Foundation of China;
Keywords
Computational modeling; Training; Network architecture; Neural networks; Evolutionary computation; Computer architecture; Image coding; Channel pruning; convolutional neural network; differentiable method; model compression; neural network pruning;
DOI
10.1109/TNNLS.2022.3165123
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Although neural networks have achieved great success in various fields, their deployment on mobile devices is limited by the computational and storage costs of large models. Model compression (neural network pruning) can significantly reduce network parameters and improve computational efficiency. In this article, we propose a differentiable network channel pruning (DNCP) method for model compression. Unlike existing methods that require sampling and evaluating a large number of substructures, our method can efficiently search, through gradient descent, for the optimal substructure that meets resource constraints (e.g., FLOPs). Specifically, we assign a learnable probability to each possible number of channels in each layer of the network, relax the selection of a particular number of channels to a softmax over all possible numbers of channels, and optimize the learnable probabilities end to end through gradient descent. After the network parameters are optimized, we prune the network according to the learned probabilities to obtain the optimal substructure. To demonstrate the effectiveness and efficiency of DNCP, experiments are conducted with ResNet and MobileNet V2 on the CIFAR, Tiny ImageNet, and ImageNet datasets.
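The selection mechanism the abstract describes, a learnable logit per candidate channel count in a layer, relaxed to a softmax distribution and pruned by keeping the most probable count, can be sketched for a single layer as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the candidate channel counts, and the linear per-channel FLOPs surrogate are assumptions introduced here, and DNCP additionally optimizes these logits jointly with the network weights by gradient descent.

```python
import math

def softmax(logits):
    # Numerically stable softmax: relaxes the discrete choice of a
    # channel count into a probability over all candidate counts.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_flops(channel_options, probs, flops_per_channel):
    # Differentiable resource surrogate: the expected layer FLOPs under
    # the relaxed distribution (assumes FLOPs grow linearly in channels).
    return sum(p * c * flops_per_channel
               for p, c in zip(probs, channel_options))

def prune_choice(channel_options, probs):
    # After optimization, prune by keeping the most probable count.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return channel_options[best]

# One layer with four candidate channel counts and learned logits.
options = [8, 16, 24, 32]
logits = [0.1, 0.5, 2.0, 0.3]  # hypothetical values after training
probs = softmax(logits)
cost = expected_flops(options, probs, flops_per_channel=1000.0)
kept = prune_choice(options, probs)  # -> 24, the highest-logit option
```

Because `expected_flops` is a smooth function of the logits (through the softmax), a resource constraint on it can be folded into the training loss and minimized by gradient descent, which is what lets the search avoid sampling and evaluating substructures one by one.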
Pages: 10203-10212
Page count: 10
Related papers
(50 records in total)
  • [41] A lightweight deep neural network model and its applications based on channel pruning and group vector quantization
    Huang, Mingzhong
    Liu, Yan
    Zhao, Lijie
    Wang, Guogang
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 36 (10): 5333-5346
  • [42] Automated Pruning for Deep Neural Network Compression
    Manessi, Franco
    Rozza, Alessandro
    Bianco, Simone
    Napoletano, Paolo
    Schettini, Raimondo
    [J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018: 657-664
  • [43] Quantisation and Pruning for Neural Network Compression and Regularisation
    Paupamah, Kimessha
    James, Steven
    Klein, Richard
    [J]. 2020 INTERNATIONAL SAUPEC/ROBMECH/PRASA CONFERENCE, 2020: 295-300
  • [44] ON THE ROLE OF STRUCTURED PRUNING FOR NEURAL NETWORK COMPRESSION
    Bragagnolo, Andrea
    Tartaglione, Enzo
    Fiandrotti, Attilio
    Grangetto, Marco
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021: 3527-3531
  • [45] Group Fisher Pruning for Practical Network Compression
    Liu, Liyang
    Zhang, Shilong
    Kuang, Zhanghui
    Zhou, Aojun
    Xue, Jing-Hao
    Wang, Xinjiang
    Chen, Yimin
    Yang, Wenming
    Liao, Qingmin
    Zhang, Wayne
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [46] Consecutive layer collaborative filter similarity for differentiable neural network pruning
    Zu, Xuan
    Li, Yun
    Yin, Baoqun
    [J]. NEUROCOMPUTING, 2023, 533: 35-45
  • [47] Neural Network Compression and Acceleration by Federated Pruning
    Pei, Songwen
    Wu, Yusheng
    Qiu, Meikang
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II, 2020, 12453: 173-183
  • [48] A framework for deep neural network multiuser authorization based on channel pruning
    Wang, Linna
    Song, Yunfei
    Zhu, Yujia
    Xia, Daoxun
    Han, Guoquan
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21):
  • [49] SIECP: Neural Network Channel Pruning based on Sequential Interval Estimation
    Chen, Si-Bao
    Zheng, Yu-Jie
    Ding, Chris H. Q.
    Luo, Bin
    [J]. NEUROCOMPUTING, 2022, 481: 1-10
  • [50] Deep neural network compression through interpretability-based filter pruning
    Yao, Kaixuan
    Cao, Feilong
    Leung, Yee
    Liang, Jiye
    [J]. PATTERN RECOGNITION, 2021, 119