Model Compression Based on Differentiable Network Channel Pruning

Cited by: 27
Authors
Zheng, Yu-Jie [1 ,2 ]
Chen, Si-Bao [1 ,2 ]
Ding, Chris H. Q. [3 ]
Luo, Bin [1 ,2 ]
Affiliations
[1] Anhui Univ, IMIS Lab Anhui Prov, Anhui Prov Key Lab Multimodal Cognit Computat, MOE Key Lab Intelligent Comp & Signal Proc ICSP, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[3] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
Funding
National Natural Science Foundation of China;
Keywords
Computational modeling; Training; Network architecture; Neural networks; Evolutionary computation; Computer architecture; Image coding; Channel pruning; convolutional neural network; differentiable method; model compression; neural network pruning;
DOI
10.1109/TNNLS.2022.3165123
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Although neural networks have achieved great success in various fields, their deployment on mobile devices is limited by the computational and storage costs of large models. Model compression (neural network pruning) can significantly reduce network parameters and improve computational efficiency. In this article, we propose a differentiable network channel pruning (DNCP) method for model compression. Unlike existing methods that require sampling and evaluating a large number of substructures, our method can efficiently search, through gradient descent, for the optimal substructure that meets resource constraints (e.g., FLOPs). Specifically, we assign a learnable probability to each possible number of channels in each layer of the network, relax the selection of a particular number of channels to a softmax over all possible numbers of channels, and optimize the learnable probabilities end to end through gradient descent. After the network parameters are optimized, we prune the network according to the learned probabilities to obtain the optimal substructure. To demonstrate the effectiveness and efficiency of DNCP, experiments are conducted with ResNet and MobileNet V2 on the CIFAR, Tiny ImageNet, and ImageNet datasets.
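The selection mechanism the abstract describes, a learnable logit per candidate channel count in a layer, relaxed to a softmax distribution and pruned by keeping the most probable count, can be sketched for a single layer as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the candidate channel counts, and the linear per-channel FLOPs surrogate are assumptions introduced here, and DNCP additionally optimizes these logits jointly with the network weights by gradient descent.

```python
import math

def softmax(logits):
    # Numerically stable softmax: relaxes the discrete choice of a
    # channel count into a probability over all candidate counts.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_flops(channel_options, probs, flops_per_channel):
    # Differentiable resource surrogate: the expected layer FLOPs under
    # the relaxed distribution (assumes FLOPs grow linearly in channels).
    return sum(p * c * flops_per_channel
               for p, c in zip(probs, channel_options))

def prune_choice(channel_options, probs):
    # After optimization, prune by keeping the most probable count.
    best = max(range(len(probs)), key=lambda i: probs[i])
    return channel_options[best]

# One layer with four candidate channel counts and learned logits.
options = [8, 16, 24, 32]
logits = [0.1, 0.5, 2.0, 0.3]  # hypothetical values after training
probs = softmax(logits)
cost = expected_flops(options, probs, flops_per_channel=1000.0)
kept = prune_choice(options, probs)  # -> 24, the highest-logit option
```

Because `expected_flops` is a smooth function of the logits (through the softmax), a resource constraint on it can be folded into the training loss and minimized by gradient descent, which is what lets the search avoid sampling and evaluating substructures one by one.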
Pages: 10203-10212
Page count: 10
Related papers
(50 records in total)
  • [41] A lightweight deep neural network model and its applications based on channel pruning and group vector quantization
    Huang, Mingzhong
    Liu, Yan
    Zhao, Lijie
    Wang, Guogang
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 36 (10): 5333-5346
  • [42] Automated Pruning for Deep Neural Network Compression
    Manessi, Franco
    Rozza, Alessandro
    Bianco, Simone
    Napoletano, Paolo
    Schettini, Raimondo
    [J]. 2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018: 657-664
  • [43] Quantisation and Pruning for Neural Network Compression and Regularisation
    Paupamah, Kimessha
    James, Steven
    Klein, Richard
    [J]. 2020 INTERNATIONAL SAUPEC/ROBMECH/PRASA CONFERENCE, 2020: 295-300
  • [44] ON THE ROLE OF STRUCTURED PRUNING FOR NEURAL NETWORK COMPRESSION
    Bragagnolo, Andrea
    Tartaglione, Enzo
    Fiandrotti, Attilio
    Grangetto, Marco
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021: 3527-3531
  • [45] Group Fisher Pruning for Practical Network Compression
    Liu, Liyang
    Zhang, Shilong
    Kuang, Zhanghui
    Zhou, Aojun
    Xue, Jing-Hao
    Wang, Xinjiang
    Chen, Yimin
    Yang, Wenming
    Liao, Qingmin
    Zhang, Wayne
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021
  • [46] Consecutive layer collaborative filter similarity for differentiable neural network pruning
    Zu, Xuan
    Li, Yun
    Yin, Baoqun
    [J]. NEUROCOMPUTING, 2023, 533: 35-45
  • [47] Neural Network Compression and Acceleration by Federated Pruning
    Pei, Songwen
    Wu, Yusheng
    Qiu, Meikang
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2020, PT II, 2020, 12453: 173-183
  • [48] A framework for deep neural network multiuser authorization based on channel pruning
    Wang, Linna
    Song, Yunfei
    Zhu, Yujia
    Xia, Daoxun
    Han, Guoquan
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2023, 35 (21):
  • [49] SIECP: Neural Network Channel Pruning based on Sequential Interval Estimation
    Chen, Si-Bao
    Zheng, Yu-Jie
    Ding, Chris H. Q.
    Luo, Bin
    [J]. NEUROCOMPUTING, 2022, 481: 1-10
  • [50] Deep neural network compression through interpretability-based filter pruning
    Yao, Kaixuan
    Cao, Feilong
    Leung, Yee
    Liang, Jiye
    [J]. PATTERN RECOGNITION, 2021, 119