Enhance the Performance of Deep Neural Networks via L2 Regularization on the Input of Activations

Cited by: 0
Authors
Guang Shi
Jiangshe Zhang
Huirong Li
Changpeng Wang
Affiliations
[1] Xi’an Jiaotong University,School of Mathematics and Statistics
[2] Shangluo University,Department of Mathematics and Computer Application
[3] Chang’an University,School of Mathematics and Information Science
Source
Neural Processing Letters | 2019, Vol. 50
Keywords
Neural networks; ReLU; Saturation phenomenon; L2 regularization;
DOI
Not available
Abstract
Deep neural networks (DNNs) are attracting increasing attention in machine learning. However, information propagation becomes increasingly difficult as networks get deeper, which makes the optimization of DNNs extremely hard. One reason for this difficulty is the saturation of hidden units. In this paper, we propose a novel methodology named RegA to reduce the influence of saturation on ReLU-DNNs (DNNs with ReLU activations). Instead of changing the activation functions or the initialization strategy, our methodology explicitly encourages the pre-activations to stay out of the saturation region. Specifically, we add an auxiliary objective induced by the L2-norm of the pre-activation values to the optimization problem. The auxiliary objective helps to activate more units and promotes effective information propagation in ReLU-DNNs. Through experiments on several large-scale real datasets, we demonstrate that better representations can be learned with RegA and that the method helps ReLU-DNNs achieve better convergence and accuracy.
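The idea of adding an auxiliary L2 term on the pre-activation values can be illustrated with a short sketch. The code below is not the authors' implementation; it is a minimal PyTorch-style illustration, assuming the auxiliary term is a penalty on the squared L2 norm of the hidden-layer pre-activations, weighted by a hypothetical coefficient lambda_reg. The exact form of the RegA objective is defined in the paper itself.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of a ReLU network that exposes its pre-activation values so an
# auxiliary L2 term can be computed on them (illustration only).
class ReLUMLP(nn.Module):
    def __init__(self, sizes):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(a, b) for a, b in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x):
        pre_acts = []
        for i, layer in enumerate(self.layers):
            x = layer(x)
            if i < len(self.layers) - 1:      # hidden layers only
                pre_acts.append(x)            # pre-activation (input of the ReLU)
                x = torch.relu(x)
        return x, pre_acts

def training_step(model, optimizer, inputs, targets, lambda_reg=1e-4):
    # lambda_reg is a hypothetical trade-off coefficient, not a value from the paper.
    optimizer.zero_grad()
    logits, pre_acts = model(inputs)
    task_loss = F.cross_entropy(logits, targets)
    # Auxiliary term: mean squared L2 norm of all hidden-layer pre-activations.
    aux = sum(z.pow(2).mean() for z in pre_acts)
    (task_loss + lambda_reg * aux).backward()
    optimizer.step()
    return task_loss.item()

In this sketch the auxiliary term simply shrinks the pre-activation magnitudes; how the paper's objective keeps units out of the ReLU saturation region is specified in the full text.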
Pages: 57-75
Number of pages: 18
Related Papers
50 items in total
  • [41] Deep Heterogeneous Graph Neural Networks via Similarity Regularization Loss and Hierarchical Fusion
    Xiong, Zhilong
    Cai, Jia
    2022 IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS, ICDMW, 2022, : 759 - 768
  • [42] Intonation classification for L2 English speech using multi-distribution deep neural networks
    Li, Kun
    Wu, Xixin
    Meng, Helen
    COMPUTER SPEECH AND LANGUAGE, 2017, 43 : 18 - 33
  • [43] An Analysis of the Regularization between L2 and Dropout in Single Hidden Layer Neural Network
    Phaisangittisagul, Ekachai
    2016 7TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS, MODELLING AND SIMULATION (ISMS), 2016, : 174 - 179
  • [44] Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TINYSCRIPT
    Fu, Fangcheng
    Hu, Yuzheng
    He, Yihan
    Jiang, Jiawei
    Shao, Yingxia
    Zhang, Ce
    Cui, Bin
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [47] Further results on L2 - L∞ state estimation of delayed neural networks
    Qian, Wei
    Chen, Yonggang
    Liu, Yurong
    Alsaadi, Fuad E.
    NEUROCOMPUTING, 2018, 273 : 509 - 515
  • [48] Deep neural networks regularization for structured output prediction
    Belharbi, Soufiane
    Herault, Romain
    Chatelain, Clement
    Adam, Sebastien
    NEUROCOMPUTING, 2018, 281 : 169 - 177
  • [49] Adaptive Knowledge Driven Regularization for Deep Neural Networks
    Luo, Zhaojing
    Cai, Shaofeng
    Cui, Can
    Ooi, Beng Chin
    Yang, Yang
    THIRTY-FIFTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THIRTY-THIRD CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE AND THE ELEVENTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2021, 35 : 8810 - 8818
  • [50] Regional Tree Regularization for Interpretability in Deep Neural Networks
    Wu, Mike
    Parbhoo, Sonali
    Hughes, Michael C.
    Kindle, Ryan
    Celi, Leo
    Zazzi, Maurizio
    Roth, Volker
    Doshi-Velez, Finale
    THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 6413 - 6421