Enhance the Performance of Deep Neural Networks via L2 Regularization on the Input of Activations

Cited by: 0
Authors
Guang Shi
Jiangshe Zhang
Huirong Li
Changpeng Wang
Institutions
[1] Xi’an Jiaotong University,School of Mathematics and Statistics
[2] Shangluo University,Department of Mathematics and Computer Application
[3] Chang’an University,School of Mathematics and Information Science
Source
Neural Processing Letters | 2019 / Volume 50
Keywords
Neural networks; ReLU; Saturation phenomenon; L2 regularization;
DOI: Not available
Abstract
Deep neural networks (DNNs) are attracting increasing attention in machine learning. However, information propagation becomes increasingly difficult as networks get deeper, which makes the optimization of DNNs extremely hard. One reason for this difficulty is the saturation of hidden units. In this paper, we propose a novel methodology named RegA to decrease the influence of saturation on ReLU-DNNs (DNNs with ReLU activations). Instead of changing the activation functions or the initialization strategy, our methodology explicitly encourages the pre-activations to lie outside the saturation region. Specifically, we add an auxiliary objective induced by the L2-norm of the pre-activation values to the optimization problem. This auxiliary objective helps activate more units and promotes effective information propagation in ReLU-DNNs. Through experiments on several large-scale real datasets, we demonstrate that better representations can be learned with RegA and that the method helps ReLU-DNNs achieve better convergence and accuracy.
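The abstract describes RegA as adding an auxiliary objective based on the L2-norm of the pre-activation values to the training loss. The following is a minimal PyTorch-style sketch of how such a penalty can be attached to a standard classification objective; the class name, the per-layer mean-squared penalty, and the weight `reg_lambda` are illustrative assumptions, not the exact formulation used in the paper.

```python
import torch
import torch.nn as nn


class MLPWithPreActPenalty(nn.Module):
    """ReLU MLP that also returns an L2-style penalty on its pre-activations.

    Illustrative sketch only: which units the auxiliary term targets and how
    it is weighted in RegA is defined in the paper, not reproduced here.
    """

    def __init__(self, sizes):
        super().__init__()
        self.linears = nn.ModuleList(
            nn.Linear(d_in, d_out) for d_in, d_out in zip(sizes[:-1], sizes[1:])
        )

    def forward(self, x):
        penalty = x.new_zeros(())
        for i, layer in enumerate(self.linears):
            z = layer(x)                              # pre-activation values
            if i < len(self.linears) - 1:             # hidden layers only
                penalty = penalty + z.pow(2).mean()   # L2-norm-style auxiliary term
                x = torch.relu(z)
            else:
                x = z                                 # output logits
        return x, penalty


# Hypothetical usage: total loss = task loss + lambda * auxiliary term.
model = MLPWithPreActPenalty([784, 256, 256, 10])
criterion = nn.CrossEntropyLoss()
reg_lambda = 1e-4                                     # assumed hyperparameter, not from the paper
inputs = torch.randn(32, 784)
targets = torch.randint(0, 10, (32,))
logits, pre_act_penalty = model(inputs)
loss = criterion(logits, targets) + reg_lambda * pre_act_penalty
loss.backward()
```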
Pages: 57–75
Number of pages: 18
Related Papers
50 records in total
  • [21] DEEP NEURAL NETWORKS FOR ESTIMATING SPEECH MODEL ACTIVATIONS
    Williamson, Donald S.
    Wang, Yuxuan
    Wang, DeLiang
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5113 - 5117
  • [22] Explaining the Behavior of Neuron Activations in Deep Neural Networks
    Wang, Longwei
    Wang, Chengfei
    Li, Yupeng
    Wang, Rui
    AD HOC NETWORKS, 2021, 111
  • [23] Input Shaping via FIR L2 Preview Tracking
    Bucher, Izhak
    Mirkin, Leonid
    Vered, Yoav
    2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 5513 - 5518
  • [24] Structured Pruning for Deep Convolutional Neural Networks via Adaptive Sparsity Regularization
    Shao, Tuanjie
    Shin, Dongkun
    2022 IEEE 46TH ANNUAL COMPUTERS, SOFTWARE, AND APPLICATIONS CONFERENCE (COMPSAC 2022), 2022, : 982 - 987
  • [25] Sparse portfolio optimization via l1 over l2 regularization
    Wu, Zhongming
    Sun, Kexin
    Ge, Zhili
    Allen-Zhao, Zhihua
    Zeng, Tieyong
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2024, 319 (03) : 820 - 833
  • [26] Transformed l1 regularization for learning sparse deep neural networks
    Ma, Rongrong
    Miao, Jianyu
    Niu, Lingfeng
    Zhang, Peng
    NEURAL NETWORKS, 2019, 119 : 286 - 298
  • [27] Input Layer Regularization of Multilayer Feedforward Neural Networks
    Li, Feng
    Zurada, Jacek M.
    Liu, Yan
    Wu, Wei
    IEEE ACCESS, 2017, 5 : 10979 - 10985
  • [28] Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks
    Li, Kun
    Qian, Xiaojun
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2017, 25 (01) : 193 - 207
  • [29] Towards Stochasticity of Regularization in Deep Neural Networks
    Sandjakoska, Ljubinka
    Bogdanova, Ana Madevska
    2018 14TH SYMPOSIUM ON NEURAL NETWORKS AND APPLICATIONS (NEUREL), 2018,
  • [30] Regularization of deep neural networks with spectral dropout
    Khan, Salman H.
    Hayat, Munawar
    Porikli, Fatih
    NEURAL NETWORKS, 2019, 110 : 82 - 90