SPARSE DEEP NEURAL NETWORKS USING L1,∞-WEIGHT NORMALIZATION

Cited by: 3
Authors
Wen, Ming [1 ]
Xu, Yixi [3 ]
Zheng, Yunling [2 ]
Yang, Zhouwang [1 ]
Wang, Xiao [3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Math Sci, Hefei, Peoples R China
[2] Univ Sci & Technol China, Sch Gifted Young, Hefei, Peoples R China
[3] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Funding
U.S. National Science Foundation
Keywords
Deep neural networks; generalization; overfitting; Rademacher complexity; sparsity
DOI
10.5705/ss.202018.0468
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Deep neural networks (DNNs) have recently demonstrated excellent performance on many challenging tasks, but overfitting remains a significant challenge. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this study, we employ L1,∞ weight normalization for DNNs with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L1,∞ weight normalization. Furthermore, we show that the upper bounds are independent of the network width and depend on the network depth k only through a factor of √k; these are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm to obtain a sparse neural network in practice. Finally, we present various experiments that validate our theory and demonstrate the effectiveness of the resulting approach.
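The abstract does not spell out the projection step of the gradient projection descent algorithm. A minimal sketch of one plausible form — Euclidean projection of each neuron's incoming weight vector onto an L1 ball of radius c, which enforces a max-row constraint of the L1,∞ type and produces exact zeros — is given below. This uses the standard L1-ball projection of Duchi et al. (2008), with the radius c as a hypothetical hyperparameter; it is an illustrative sketch, not necessarily the authors' exact algorithm.

```python
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of vector v onto the L1 ball of the given radius
    (Duchi et al., 2008). Returns v unchanged if it is already inside the ball;
    otherwise soft-thresholds, which sets many coordinates exactly to zero."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # magnitudes, descending
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - (css - radius) / k > 0)[0][-1]
    theta = (css[rho] - radius) / (rho + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, radius=1.0):
    """Project each row of W (one neuron's incoming weights) onto the L1 ball,
    enforcing max_i ||W[i, :]||_1 <= radius, i.e. an L1,inf-type constraint."""
    return np.vstack([project_l1_ball(row, radius) for row in W])
```

In a projected gradient step, one would first take an ordinary (stochastic) gradient update of each layer's weight matrix and then apply `project_l1_inf` to map the iterate back onto the feasible set; the soft-thresholding inside the projection is what yields the sparse architecture.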
Pages: 1397-1414 (18 pages)