SPARSE DEEP NEURAL NETWORKS USING L1,∞-WEIGHT NORMALIZATION

Cited by: 3
Authors
Wen, Ming [1 ]
Xu, Yixi [3 ]
Zheng, Yunling [2 ]
Yang, Zhouwang [1 ]
Wang, Xiao [3 ]
Affiliations
[1] Univ Sci & Technol China, Sch Math Sci, Hefei, Peoples R China
[2] Univ Sci & Technol China, Sch Gifted Young, Hefei, Peoples R China
[3] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
Funding
US National Science Foundation
Keywords
Deep neural networks; generalization; overfitting; Rademacher complexity; sparsity
DOI
10.5705/ss.202018.0468
Chinese Library Classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Discipline Codes
020208; 070103; 0714
Abstract
Deep neural networks (DNNs) have recently demonstrated excellent performance on many challenging tasks. However, overfitting remains a significant challenge in DNNs. Empirical evidence suggests that inducing sparsity can relieve overfitting, and that weight normalization can accelerate algorithm convergence. In this study, we employ L1,∞ weight normalization for DNNs with bias neurons to achieve a sparse architecture. We theoretically establish generalization error bounds for both regression and classification under L1,∞ weight normalization. Furthermore, we show that the upper bounds are independent of the network width and depend on the network depth k only through a √k factor, which are the best available bounds for networks with bias neurons. These results provide theoretical justification for using such weight normalization to reduce the generalization error. We also develop an easily implemented gradient projection descent algorithm that yields a sparse neural network in practice. Finally, we present experiments that validate our theory and demonstrate the effectiveness of the resulting approach.
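The gradient projection descent step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the L1,∞ constraint takes the form max over rows of the row-wise L1 norms being at most some radius `c`, so the projection reduces to a standard sort-based Euclidean projection of each row onto an L1 ball (which soft-thresholds entries and thereby induces sparsity). The names `project_l1_ball`, `project_l1_inf`, and `projected_gradient_step` are illustrative.

```python
import numpy as np

def project_l1_ball(v, radius):
    # Euclidean projection of v onto {x : ||x||_1 <= radius},
    # via the standard sort-and-threshold algorithm.
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                     # magnitudes, descending
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, v.size + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)        # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def project_l1_inf(W, c):
    # Enforce max_i ||W[i, :]||_1 <= c by projecting each row onto the
    # L1 ball of radius c; rows that shrink get small entries zeroed.
    return np.vstack([project_l1_ball(row, c) for row in W])

def projected_gradient_step(W, G, lr, c):
    # One projected-gradient update: descend along gradient G with
    # learning rate lr, then project back onto the L1,inf constraint set.
    return project_l1_inf(W - lr * G, c)
```

For example, with `c = 2` the row `[3, 1, 0]` projects to `[2, 0, 0]`: the soft-thresholding that restores feasibility also zeroes the small entry, which is the sparsity-inducing effect the paper exploits.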
Pages: 1397-1414 (18 pages)