Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 11
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Dai, Wenrui [1 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA 92121 USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China;
Keywords
low-rank; decomposition; acceleration; pruning;
DOI
10.1109/EMC2-NIPS53020.2019.00011
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can ripple into a large prediction loss, so separating low-rank approximation from training is suboptimal. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer optimized by stochastic sub-gradient descent is used to further promote low rank in TRP. Networks trained with TRP have an inherently low-rank structure and can be approximated with negligible performance loss, eliminating the need for fine-tuning after low-rank approximation. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
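For intuition, the objective described in the abstract can be read as min_W L(W) + lam * sum_l ||W_l||_*, where ||.||_* is the nuclear norm; a standard sub-gradient of ||W||_* at W = U diag(S) V^T is U V^T. The following is a minimal PyTorch sketch of one alternating TRP-style step under that reading: it is restricted to fully connected layers (convolutions would first need reshaping into matrices), and the energy threshold, svd_every, and lam values are illustrative assumptions, not the paper's settings.

import torch
import torch.nn as nn

def low_rank_approx(weight, energy=0.98):
    # Truncated SVD: keep the smallest rank retaining `energy` of the
    # squared singular-value mass (an illustrative rank-selection rule;
    # the paper's exact criterion may differ).
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    cum = torch.cumsum(S ** 2, dim=0)
    rank = int((cum < energy * cum[-1]).sum().item()) + 1
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

def nuclear_norm_subgrad(weight):
    # A sub-gradient of the nuclear norm ||W||_* at W = U diag(S) V^T
    # is U V^T.
    U, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U @ Vh

def trp_step(model, batch, target, optimizer, loss_fn,
             step, svd_every=20, lam=3e-4):
    # One alternating iteration: an SGD step on the task loss, a
    # sub-gradient step on the nuclear-norm penalty, and (periodically)
    # replacement of each weight matrix by its low-rank approximation.
    optimizer.zero_grad()
    loss_fn(model(batch), target).backward()
    optimizer.step()
    lr = optimizer.param_groups[0]["lr"]
    with torch.no_grad():
        for m in model.modules():
            if isinstance(m, nn.Linear):  # convs need reshaping first
                m.weight -= lr * lam * nuclear_norm_subgrad(m.weight)
                if step % svd_every == 0:
                    m.weight.copy_(low_rank_approx(m.weight))

Because the weights are periodically projected onto low-rank matrices during training, the final network can be factorized after training with little accuracy loss, which is why no post-hoc fine-tuning is needed.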
Pages: 14-17
Page count: 4