SPARSE DEEP NEURAL NETWORKS USING L1,∞-WEIGHT NORMALIZATION

Cited by: 3
Authors
Wen, Ming [1 ]
Xu, Yixi [3 ]
Zheng, Yunling [2 ]
Yang, Zhouwang [1 ]
Wang, Xiao [3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Math Sci, Hefei, Peoples R China
[2] Univ Sci & Technol China, Sch Gifted Young, Hefei, Peoples R China
[3] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
Deep neural networks; generalization; overfitting; Rademacher complexity; sparsity
DOI
10.5705/ss.202018.0468
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Deep neural networks (DNNs) have recently demonstrated excellent performance on many challenging tasks, but overfitting remains a significant challenge. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this study, we employ L1,∞ weight normalization for DNNs with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L1,∞ weight normalization. Furthermore, we show that the upper bounds are independent of the network width and depend on the network depth k only through a factor of √k; these are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm to obtain a sparse neural network in practice. Finally, we present various experiments that validate our theory and demonstrate the effectiveness of the resulting approach.
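The abstract does not spell out the projection step of the gradient projection descent algorithm. A minimal sketch of one plausible form — Euclidean projection of each neuron's incoming weight vector onto an L1 ball of radius c, which enforces a max-row constraint of the L1,∞ type and produces exact zeros — is given below. This uses the standard L1-ball projection of Duchi et al. (2008), with the radius c as a hypothetical hyperparameter; it is an illustrative sketch, not necessarily the authors' exact algorithm.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of vector v onto the L1 ball of the given radius
    (Duchi et al., 2008). Returns v unchanged if it is already inside the ball;
    otherwise soft-thresholds, which sets many coordinates exactly to zero."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, radius=1.0):
    """Project each row of W (one neuron's incoming weights) onto the L1 ball,
    enforcing max_i ||W[i, :]||_1 <= radius, i.e. an L1,inf-type constraint."""
    return np.vstack([project_l1_ball(row, radius) for row in W])
```

In a projected gradient step, one would first take an ordinary (stochastic) gradient update of each layer's weight matrix and then apply `project_l1_inf` to map the iterate back onto the feasible set; the soft-thresholding inside the projection is what yields the sparse architecture.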
Pages: 1397-1414 (18 pages)