SPARSE DEEP NEURAL NETWORKS USING L1,∞-WEIGHT NORMALIZATION

Cited by: 3
Authors
Wen, Ming [1 ]
Xu, Yixi [3 ]
Zheng, Yunling [2 ]
Yang, Zhouwang [1 ]
Wang, Xiao [3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Math Sci, Hefei, Peoples R China
[2] Univ Sci & Technol China, Sch Gifted Young, Hefei, Peoples R China
[3] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Funding
US National Science Foundation
Keywords
Deep neural networks; generalization; overfitting; Rademacher complexity; sparsity
DOI
10.5705/ss.202018.0468
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Deep neural networks (DNNs) have recently demonstrated excellent performance on many challenging tasks. However, overfitting remains a significant challenge in DNNs. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this study, we employ L1,∞ weight normalization for DNNs with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L1,∞ weight normalization. Furthermore, we show that the upper bounds are independent of the network width and depend on the network depth k only through a √k factor, which are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm that yields a sparse neural network in practice. Finally, we present experiments that validate our theory and demonstrate the effectiveness of the resulting approach.
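The gradient projection descent step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the L1,∞ constraint takes the form max over rows of the row-wise L1 norms being at most some radius `c`, so the projection reduces to a standard sort-based Euclidean projection of each row onto an L1 ball (which soft-thresholds entries and thereby induces sparsity). The names `project_l1_ball`, `project_l1_inf`, and `projected_gradient_step` are illustrative.

```python
import numpy as np

def project_l1_ball(v, radius):
    # Euclidean projection of v onto {x : ||x||_1 <= radius},
    # via the standard sort-and-threshold algorithm.
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                     # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)        # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, c):
    # Enforce max_i ||W[i, :]||_1 <= c by projecting each row onto the
    # L1 ball of radius c; rows that shrink get small entries zeroed.
    return np.vstack([project_l1_ball(row, c) for row in W])

def projected_gradient_step(W, G, lr, c):
    # One projected-gradient update: descend along gradient G with
    # learning rate lr, then project back onto the L1,inf constraint set.
    return project_l1_inf(W - lr * G, c)
```

For example, with `c = 2` the row `[3, 1, 0]` projects to `[2, 0, 0]`: the soft-thresholding that restores feasibility also zeroes the small entry, which is the sparsity-inducing effect the paper exploits.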
Pages: 1397-1414 (18 pages)