Deep Learning Model Compression With Rank Reduction in Tensor Decomposition

Cited by: 0
Authors
Dai, Wei [1 ,2 ]
Fan, Jicong [1 ,3 ]
Miao, Yiming [1 ,2 ]
Hwang, Kai [1 ,2 ]
Affiliations
[1] Chinese Univ Hong Kong, Sch Data Sci, Shenzhen 518172, Peoples R China
[2] Shenzhen Inst Artificial Intelligence & Robot Soc, Shenzhen 518129, Peoples R China
[3] Shenzhen Res Inst Big Data, Shenzhen 518172, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Tensors; Training; Matrix decomposition; Image coding; Computational modeling; Adaptation models; Deep learning (DL); Low-rank decomposition; Model compression; Rank reduction (RR); Neural networks
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Classification Codes
081104; 0812; 0835; 1405
Abstract
Large neural network models are hard to deploy on lightweight edge devices and demand large network bandwidth. In this article, we propose a novel deep learning (DL) model compression method. Specifically, we present a dual-model training strategy with iterative and adaptive rank reduction (RR) in tensor decomposition. Our method regularizes DL models while preserving their accuracy, and the adaptive RR significantly reduces the hyperparameter search space. We provide a theoretical analysis of the convergence and complexity of the proposed method. Evaluated with LeNet, VGG, ResNet, EfficientNet, and RevCol on the MNIST, CIFAR-10/100, and ImageNet datasets, our method outperforms the baseline compression methods in both model compression and accuracy preservation, and the experimental results validate our theoretical findings. For VGG-16 on CIFAR-10, the compressed model gains 0.88% accuracy with a 10.41× storage reduction and a 6.29× speedup. For ResNet-50 on ImageNet, compression yields a 2.36× storage reduction and a 2.17× speedup. In federated learning (FL) applications, our scheme reduces communication overhead by 13.96×. In summary, our compressed DL models can significantly improve image understanding and pattern recognition processes.
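To make the low-rank compression idea concrete, the following is a minimal sketch, not the paper's adaptive RR algorithm: it factors a single weight matrix with a truncated SVD, keeping the smallest rank that retains a chosen fraction of the spectral energy. The function and parameter names (compress_linear, energy) are illustrative assumptions, not from the paper.

```python
import numpy as np

def compress_linear(W: np.ndarray, energy: float = 0.99):
    """Factor W (m x n) as A @ B with a truncated SVD, keeping the
    smallest rank r whose singular values retain `energy` of the
    total spectral energy. Storage drops from m*n to r*(m+n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    cum = np.cumsum(s**2) / np.sum(s**2)       # cumulative energy fraction
    r = int(np.searchsorted(cum, energy)) + 1  # smallest rank reaching `energy`
    A = U[:, :r] * s[:r]                       # (m, r), singular values folded in
    B = Vt[:r, :]                              # (r, n)
    return A, B

rng = np.random.default_rng(0)
# A synthetic, approximately low-rank layer weight for demonstration.
W = rng.standard_normal((512, 64)) @ rng.standard_normal((64, 512))
A, B = compress_linear(W, energy=0.99)
print(f"rank {A.shape[1]}: {W.size} params -> {A.size + B.size} "
      f"({W.size / (A.size + B.size):.2f}x storage reduction)")
print("max reconstruction error:", np.max(np.abs(W - A @ B)))
```

Replacing one dense layer with two thin factors in this way is the basic mechanism behind the storage and speedup figures quoted in the abstract; the paper's contribution is choosing and iteratively reducing the ranks during training rather than fixing them post hoc.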
Pages: 1315-1328
Number of pages: 14
Related Papers
50 records in total
  • [21] Tensor Decomposition for Model Reduction in Neural Networks: A Review
    Liu, Xingyi
    Parhi, Keshab K.
    IEEE CIRCUITS AND SYSTEMS MAGAZINE, 2023, 23 (02) : 8 - 28
  • [22] Convolutional Neural Network Compression via Tensor-Train Decomposition on Permuted Weight Tensor with Automatic Rank Determination
    Gabor, Mateusz
    Zdunek, Rafal
    COMPUTATIONAL SCIENCE - ICCS 2022, PT III, 2022, 13352 : 654 - 667
  • [23] Learning a deep convolutional neural network via tensor decomposition
    Oymak, Samet
    Soltanolkotabi, Mahdi
    INFORMATION AND INFERENCE-A JOURNAL OF THE IMA, 2021, 10 (03) : 1031 - 1071
  • [24] Learning Low-Rank Representations for Model Compression
    Zhu, Zezhou
    Dong, Yuan
    Zhao, Zhong
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [25] A Framework for Data Representation, Processing, and Dimensionality Reduction with the Best-Rank Tensor Decomposition
    Cyganek, Boguslaw
    PROCEEDINGS OF THE ITI 2012 34TH INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY INTERFACES (ITI), 2012, : 325 - 330
  • [26] Tensor tree decomposition as a rank-reduction method for pre-stack interpolation
    Manenti, Rafael
    Sacchi, Mauricio D.
    GEOPHYSICAL PROSPECTING, 2023, 71 (08) : 1404 - 1419
  • [27] Lightweight Deep Learning with Model Compression
    Kang, U.
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2021), PT III, 2021, 12683 : 662 - 663
  • [28] Parametric model order reduction based on parallel tensor compression
    Li, Zhen
    Jiang, Yao-Lin
    Mu, Hong-liang
    INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 2021, 52 (11) : 2201 - 2216
  • [29] A Normal Form Algorithm for Tensor Rank Decomposition
    Telen, Simon
    Vannieuwenhoven, Nick
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2022, 48 (04)
  • [30] Adaptive Rank Selection for Tensor Ring Decomposition
    Sedighin, Farnaz
    Cichocki, Andrzej
    Phan, Anh-Huy
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2021, 15 (03) : 454 - 463