Weight and Gradient Centralization in Deep Neural Networks

Cited by: 4
Authors
Fuhl, Wolfgang [1 ]
Kasneci, Enkelejda [1 ]
Affiliation
[1] Univ Tubingen, Sand 14, D-72076 Tubingen, Germany
Keywords
Neural networks; Normalization; DNN; Deep neural networks; Descent
DOI
10.1007/978-3-030-86380-7_19
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Batch normalization is currently the most widely used form of internal normalization for deep neural networks. Further work has shown that normalizing and additionally conditioning the weights, as well as normalizing the gradients, improves generalization even more. In this work, we combine several of these methods and thereby increase the generalization of the networks. The advantage of the newer methods over batch normalization is not only improved generalization, but also that they only have to be applied during training and therefore do not affect the running time at inference. Link: https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FWeightAndGradientCentralization&mode=list
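The core idea summarized in the abstract is to re-center the weights and their gradients during training only, so no extra computation is needed at inference. Below is a minimal sketch of that idea in a PyTorch-style training loop; the function names, the per-output-channel mean, and the loop structure are illustrative assumptions, not the authors' published implementation.

```python
import torch

def centralize(tensor: torch.Tensor) -> torch.Tensor:
    """Subtract the mean over all dimensions except the first (output channels)."""
    if tensor.dim() > 1:
        dims = tuple(range(1, tensor.dim()))
        return tensor - tensor.mean(dim=dims, keepdim=True)
    return tensor

def centralize_gradients(model: torch.nn.Module) -> None:
    """Gradient centralization: re-center each multi-dimensional weight gradient
    before the optimizer step (biases and 1-D parameters are left untouched)."""
    for p in model.parameters():
        if p.grad is not None and p.dim() > 1:
            p.grad = centralize(p.grad)

def centralize_weights(model: torch.nn.Module) -> None:
    """Weight centralization: re-center the weights themselves after the update."""
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:
                p.copy_(centralize(p))

# Illustrative use inside a training step (placeholders, not the authors' code):
#   loss.backward()
#   centralize_gradients(model)   # applied only during training
#   optimizer.step()
#   centralize_weights(model)     # adds no cost at inference time
```

Because both operations run outside the forward pass, a trained model is used exactly as any other network; this is the runtime advantage over batch normalization mentioned in the abstract.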
Pages: 227-239
Page count: 13
Related papers
50 records in total (10 shown below)
  • [1] On Centralization and Unitization of Batch Normalization for Deep ReLU Neural Networks
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 2827 - 2841
  • [2] Weight normalized deep neural networks
    Xu, Yixi
    Wang, Xiao
    [J]. STAT, 2021, 10 (01):
  • [3] Enhanced gradient learning for deep neural networks
    Yan, Ming
    Yang, Jianxi
    Chen, Cen
    Zhou, Joey Tianyi
    Pan, Yi
    Zeng, Zeng
    [J]. IET IMAGE PROCESSING, 2022, 16 (02) : 365 - 377
  • [4] Adaptive Weight Decay for Deep Neural Networks
    Nakamura, Kensuke
    Hong, Byung-Woo
    [J]. IEEE ACCESS, 2019, 7 : 118857 - 118865
  • [5] Exploring weight symmetry in deep neural networks
    Hu, Shell Xu
    Zagoruyko, Sergey
    Komodakis, Nikos
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 187
  • [6] Hierarchical Weight Averaging for Deep Neural Networks
    Gu, Xiaozhe
    Zhang, Zixun
    Jiang, Yuncheng
    Luo, Tao
    Zhang, Ruimao
    Cui, Shuguang
    Li, Zhen
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12276 - 12287
  • [7] Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
    Li, Chunyuan
    Chen, Changyou
    Carlson, David
    Carin, Lawrence
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1788 - 1794
  • [8] GRADUAL SURROGATE GRADIENT LEARNING IN DEEP SPIKING NEURAL NETWORKS
    Chen, Yi
    Zhang, Silin
    Ren, Shiyu
    Qu, Hong
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8927 - 8931
  • [9] Heterogeneous gradient computing optimization for scalable deep neural networks
    Moreno-Álvarez, Sergio
    Paoletti, Mercedes E.
    Rico-Gallego, Juan A.
    Haut, Juan M.
    [J]. THE JOURNAL OF SUPERCOMPUTING, 2022, 78 : 13455 - 13469
  • [10] Learning dynamics of gradient descent optimization in deep neural networks
    Wu, Wei
    Jing, Xiaoyuan
    Du, Wencai
    Chen, Guoliang
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (05)