Trained Rank Pruning for Efficient Deep Neural Networks

Cited by: 11
Authors
Xu, Yuhui [1 ]
Li, Yuxi [1 ]
Zhang, Shuai [2 ]
Wen, Wei [3 ]
Wang, Botao [2 ]
Dai, Wenrui [1 ]
Qi, Yingyong [2 ]
Chen, Yiran [3 ]
Lin, Weiyao [1 ]
Xiong, Hongkai [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[2] Qualcomm AI Res, San Diego, CA 92121 USA
[3] Duke Univ, Durham, NC 27706 USA
Funding
National Natural Science Foundation of China;
Keywords
low-rank; decomposition; acceleration; pruning;
DOI
10.1109/EMC2-NIPS53020.2019.00011
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition; however, small approximation errors in the parameters can ripple through the network into a large prediction loss. Evidently, it is suboptimal to separate low-rank approximation from training. Unlike previous works, this paper integrates low-rank approximation and regularization into the training process. We propose Trained Rank Pruning (TRP), which alternates between low-rank approximation and training. TRP maintains the capacity of the original network while imposing low-rank constraints during training. A nuclear-norm regularizer optimized by stochastic sub-gradient descent is used to further promote low rank in TRP. Networks trained with TRP have an inherently low-rank structure and can be approximated with negligible performance loss, eliminating the need for fine-tuning after low-rank approximation. The proposed method is comprehensively evaluated on CIFAR-10 and ImageNet, outperforming previous compression methods based on low-rank approximation.
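The abstract describes an alternating scheme: train normally, periodically snap the weights to a truncated-SVD low-rank approximation, and add a nuclear-norm regularizer optimized by stochastic sub-gradient descent. Below is a minimal PyTorch sketch of that idea, assuming application to nn.Linear layers only (the paper also targets convolutions, whose kernels would first be reshaped into 2-D matrices); the energy threshold, trp_period, and nuclear_weight values are illustrative placeholders, not the paper's hyper-parameters.

```python
# Sketch of the TRP idea from the abstract: alternate ordinary training
# steps with a truncated-SVD projection of the weights, and add the
# nuclear-norm sub-gradient (U @ Vh) to each weight gradient to further
# promote low rank during training.
import torch
import torch.nn as nn

def low_rank_project(weight: torch.Tensor, energy: float = 0.95) -> torch.Tensor:
    """Truncated-SVD reconstruction keeping enough singular values to
    retain `energy` of the summed spectrum. This energy rule is an
    illustrative rank-selection criterion, not the paper's."""
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    cumulative = torch.cumsum(S, dim=0) / S.sum()
    rank = int((cumulative < energy).sum().item()) + 1
    return U[:, :rank] @ torch.diag(S[:rank]) @ Vh[:rank, :]

def nuclear_subgradient(weight: torch.Tensor) -> torch.Tensor:
    """A sub-gradient of the nuclear norm ||W||_* is U @ V^T."""
    U, _, Vh = torch.linalg.svd(weight, full_matrices=False)
    return U @ Vh

def train_step(model, loss_fn, optimizer, x, y, step,
               trp_period=20, nuclear_weight=3e-4):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # Stochastic sub-gradient descent on the nuclear-norm regularizer.
    for m in model.modules():
        if isinstance(m, nn.Linear) and m.weight.grad is not None:
            m.weight.grad.add_(nuclear_weight * nuclear_subgradient(m.weight.detach()))
    optimizer.step()
    # Alternation: every `trp_period` steps, snap weights to low rank.
    if step % trp_period == 0:
        with torch.no_grad():
            for m in model.modules():
                if isinstance(m, nn.Linear):
                    m.weight.copy_(low_rank_project(m.weight))
    return loss.item()
```

Because the projection happens repeatedly during training, the network keeps adapting to its own low-rank version, which is why, per the abstract, the final decomposition needs no fine-tuning.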
Pages: 14-17
Number of pages: 4
Related papers (50 items)
  • [31] Anonymous Model Pruning for Compressing Deep Neural Networks
    Zhang, Lechun
    Chen, Guangyao
    Shi, Yemin
    Zhang, Quan
    Tan, Mingkui
    Wang, Yaowei
    Tian, Yonghong
    Huang, Tiejun
    [J]. THIRD INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2020), 2020, : 161 - 164
  • [32] CUP: Cluster Pruning for Compressing Deep Neural Networks
    Duggal, Rahul
    Xiao, Cao
    Vuduc, Richard
Chau, Duen Horng
    Sun, Jimeng
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2021, : 5102 - 5106
  • [33] Pruning Deep Neural Networks by Optimal Brain Damage
    Liu, Chao
    Zhang, Zhiyong
    Wang, Dong
    [J]. 15TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2014), VOLS 1-4, 2014, : 1092 - 1095
  • [34] Structured Pruning for Deep Convolutional Neural Networks: A Survey
    He, Yang
    Xiao, Lingao
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (05) : 2900 - 2919
  • [35] Class-dependent Pruning of Deep Neural Networks
    Entezari, Rahim
    Saukh, Olga
    [J]. 2020 IEEE SECOND WORKSHOP ON MACHINE LEARNING ON EDGE IN SENSOR SYSTEMS (SENSYS-ML 2020), 2020, : 13 - 18
  • [36] Conditional Automated Channel Pruning for Deep Neural Networks
    Liu, Yixin
    Guo, Yong
    Guo, Jiaxin
    Jiang, Luoqian
    Chen, Jian
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2021, 28 : 1275 - 1279
  • [37] On the Information of Feature Maps and Pruning of Deep Neural Networks
    Soltani, Mohammadreza
    Wu, Suya
    Ding, Jie
    Ravier, Robert
    Tarokh, Vahid
    [J]. 2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 6988 - 6995
  • [38] Self-distilled Pruning of Deep Neural Networks
O'Neill, James
    Dutta, Sourav
    Assem, Haytham
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT II, 2023, 13714 : 655 - 670
  • [39] Channel Pruning for Accelerating Very Deep Neural Networks
    He, Yihui
    Zhang, Xiangyu
    Sun, Jian
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1398 - 1406
  • [40] Fast and Efficient Deep Sparse Multi-Strength Spiking Neural Networks with Dynamic Pruning
    Chen, Ruizhi
    Ma, Hong
    Xie, Shaolin
    Guo, Peng
    Li, Pin
    Wang, Donglin
    [J]. 2018 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2018, : 412 - 419