Enhance the Performance of Deep Neural Networks via L2 Regularization on the Input of Activations

Cited by: 0
Authors
Guang Shi
Jiangshe Zhang
Huirong Li
Changpeng Wang
Affiliations
[1] Xi’an Jiaotong University,School of Mathematics and Statistics
[2] Shangluo University,Department of Mathematics and Computer Application
[3] Chang’an University,School of Mathematics and Information Science
Source
Neural Processing Letters | 2019, Vol. 50
Keywords
Neural networks; ReLU; Saturation phenomenon; L2 regularization;
DOI
Not available
Abstract
Deep neural networks (DNNs) are attracting increasing attention in machine learning. However, information propagation becomes increasingly difficult as networks get deeper, which makes the optimization of DNNs extremely hard. One reason for this difficulty is the saturation of hidden units. In this paper, we propose a novel methodology named RegA to reduce the influence of saturation on ReLU-DNNs (DNNs with ReLU activations). Instead of changing the activation functions or the initialization strategy, our methodology explicitly encourages the pre-activations to stay out of the saturation region. Specifically, we add an auxiliary objective induced by the L2-norm of the pre-activation values to the optimization problem. The auxiliary objective helps to activate more units and promotes effective information propagation in ReLU-DNNs. Through experiments on several large-scale real datasets, we demonstrate that better representations can be learned with RegA and that the method helps ReLU-DNNs achieve better convergence and accuracy.
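The idea of adding an auxiliary L2 term on the pre-activation values can be illustrated with a short sketch. The code below is not the authors' implementation; it is a minimal PyTorch-style illustration, assuming the auxiliary term is a penalty on the squared L2 norm of the hidden-layer pre-activations, weighted by a hypothetical coefficient lambda_reg. The exact form of the RegA objective is defined in the paper itself.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of a ReLU network that exposes its pre-activation values so an
# auxiliary L2 term can be computed on them (illustration only).
class ReLUMLP(nn.Module):
    def __init__(self, sizes):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        pre_acts = []
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:      # hidden layers only
                pre_acts.append(x)            # pre-activation (input of the ReLU)
                x = torch.relu(x)
        return x, pre_acts

def training_step(model, optimizer, inputs, targets, lambda_reg=1e-4):
    # lambda_reg is a hypothetical trade-off coefficient, not a value from the paper.
    optimizer.zero_grad()
    logits, pre_acts = model(inputs)
    task_loss = F.cross_entropy(logits, targets)
    # Auxiliary term: mean squared L2 norm of all hidden-layer pre-activations.
    aux = sum(z.pow(2).mean() for z in pre_acts)
    (task_loss + lambda_reg * aux).backward()
    optimizer.step()
    return task_loss.item()

In this sketch the auxiliary term simply shrinks the pre-activation magnitudes; how the paper's objective keeps units out of the ReLU saturation region is specified in the full text.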
Pages: 57-75
Number of pages: 18
Related Papers
50 items in total
  • [41] Deep Heterogeneous Graph Neural Networks via Similarity Regularization Loss and Hierarchical Fusion
    Xiong, Zhilong
    Cai, Jia
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW, 2022, : 759 - 768
  • [42] Intonation classification for L2 English speech using multi-distribution deep neural networks
    Li, Kun
    Wu, Xixin
    Meng, Helen
    COMPUTER SPEECH AND LANGUAGE, 2017, 43 : 18 - 33
  • [43] An Analysis of the Regularization between L2 and Dropout in Single Hidden Layer Neural Network
    Phaisangittisagul, Ekachai
    2016 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION (ISMS), 2016, : 174 - 179
  • [44] Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TINYSCRIPT
    Fu, Fangcheng
    Hu, Yuzheng
    He, Yihan
    Jiang, Jiawei
    Shao, Yingxia
    Zhang, Ce
    Cui, Bin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [47] Further results on L2 - L∞ state estimation of delayed neural networks
    Qian, Wei
    Chen, Yonggang
    Liu, Yurong
    Alsaadi, Fuad E.
    NEUROCOMPUTING, 2018, 273 : 509 - 515
  • [48] Deep neural networks regularization for structured output prediction
    Belharbi, Soufiane
    Herault, Romain
    Chatelain, Clement
    Adam, Sebastien
    NEUROCOMPUTING, 2018, 281 : 169 - 177
  • [49] Adaptive Knowledge Driven Regularization for Deep Neural Networks
    Luo, Zhaojing
    Cai, Shaofeng
    Cui, Can
    Ooi, Beng Chin
    Yang, Yang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8810 - 8818
  • [50] Regional Tree Regularization for Interpretability in Deep Neural Networks
    Wu, Mike
    Parbhoo, Sonali
    Hughes, Michael C.
    Kindle, Ryan
    Celi, Leo
    Zazzi, Maurizio
    Roth, Volker
    Doshi-Velez, Finale
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6413 - 6421