Deep Learning Model Compression With Rank Reduction in Tensor Decomposition

Cited: 0
Authors
Dai, Wei [1 ,2 ]
Fan, Jicong [1 ,3 ]
Miao, Yiming [1 ,2 ]
Hwang, Kai [1 ,2 ]
Affiliations
[1] Chinese Univ Hong Kong, Sch Data Sci, Shenzhen 518172, Peoples R China
[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518129, Peoples R China
[3] Shenzhen Res Inst Big Data, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Tensors; Training; Matrix decomposition; Image coding; Computational modeling; Adaptation models; Deep learning; Deep learning (DL); low-rank decomposition; model compression; rank reduction (RR); neural networks
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Large neural network models are difficult to deploy on lightweight edge devices and demand large network bandwidth to transmit. In this article, we propose a novel deep learning (DL) model compression method. Specifically, we present a dual-model training strategy with iterative and adaptive rank reduction (RR) in tensor decomposition. Our method regularizes DL models while preserving their accuracy, and the adaptive RR significantly reduces the hyperparameter search space. We provide a theoretical analysis of the convergence and complexity of the proposed method. Tested on LeNet, VGG, ResNet, EfficientNet, and RevCol over the MNIST, CIFAR-10/100, and ImageNet datasets, our method outperforms the baseline compression methods in both model compression and accuracy preservation, and the experimental results validate our theoretical findings. For VGG-16 on CIFAR-10, our compressed model achieves a 0.88% accuracy gain with a 10.41x storage reduction and a 6.29x speedup. For ResNet-50 on ImageNet, our compressed model yields a 2.36x storage reduction and a 2.17x speedup. In federated learning (FL) applications, our scheme reduces communication overhead by a factor of 13.96. In summary, our compression method can significantly improve image understanding and pattern recognition pipelines.
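The abstract summarizes the method but includes no pseudocode. As a minimal sketch of the underlying idea (a low-rank factorization whose rank is chosen adaptively rather than hand-tuned), the snippet below compresses a single weight matrix by truncated SVD. The function name, the spectral-energy criterion, and the 0.95 threshold are illustrative assumptions, not the paper's dual-model RR algorithm.

```python
import numpy as np

def truncated_svd_compress(W, energy=0.95):
    """Factor W (m x n) into A (m x r) @ B (r x n) with an adaptively chosen rank r.

    Illustrative heuristic only: the rank is the smallest r whose leading
    singular values retain the given fraction of the squared spectrum.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = int(np.searchsorted(cum, energy)) + 1
    A = U[:, :r] * s[:r]   # fold singular values into the left factor
    B = Vt[:r, :]
    return A, B, r

# Demo: a 512 x 512 matrix with true rank 64 compresses to r close to 64,
# storing (512 + 512) * r values instead of 512 * 512.
rng = np.random.default_rng(0)
W = rng.standard_normal((512, 64)) @ rng.standard_normal((64, 512))
A, B, r = truncated_svd_compress(W)
print(r, np.linalg.norm(W - A @ B) / np.linalg.norm(W))
```

In a full network, each large convolutional or fully connected layer would be replaced by the pair of smaller factors A and B, which is the source of the storage reduction and inference speedup the abstract reports.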
Pages: 1315-1328 (14 pages)