Weight and Gradient Centralization in Deep Neural Networks

Cited by: 4
Authors
Fuhl, Wolfgang [1 ]
Kasneci, Enkelejda [1 ]
Affiliation
[1] Univ Tubingen, Sand 14, D-72076 Tubingen, Germany
Keywords
Neural networks; Normalization; DNN; Deep neural networks; Descent
DOI
10.1007/978-3-030-86380-7_19
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Batch normalization is currently the most widely used form of internal normalization for deep neural networks. Further work has shown that normalizing and additionally conditioning the weights, as well as normalizing the gradients, improves generalization even more. In this work, we combine several of these methods and thereby increase the generalization of the networks. The advantage of the newer methods over batch normalization is not only improved generalization, but also that they only have to be applied during training and therefore do not affect the running time at inference. Link: https://atreus.informatik.uni-tuebingen.de/seafile/d/8e2ab8c3fdd444e1a135/?p=%2FWeightAndGradientCentralization&mode=list
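The core idea summarized in the abstract is to re-center the weights and their gradients during training only, so no extra computation is needed at inference. Below is a minimal sketch of that idea in a PyTorch-style training loop; the function names, the per-output-channel mean, and the loop structure are illustrative assumptions, not the authors' published implementation.

```python
import torch

def centralize(tensor: torch.Tensor) -> torch.Tensor:
    """Subtract the mean over all dimensions except the first (output channels)."""
    if tensor.dim() > 1:
        dims = tuple(range(1, tensor.dim()))
        return tensor - tensor.mean(dim=dims, keepdim=True)
    return tensor

def centralize_gradients(model: torch.nn.Module) -> None:
    """Gradient centralization: re-center each multi-dimensional weight gradient
    before the optimizer step (biases and 1-D parameters are left untouched)."""
    for p in model.parameters():
        if p.grad is not None and p.dim() > 1:
            p.grad = centralize(p.grad)

def centralize_weights(model: torch.nn.Module) -> None:
    """Weight centralization: re-center the weights themselves after the update."""
    with torch.no_grad():
        for p in model.parameters():
            if p.dim() > 1:
                p.copy_(centralize(p))

# Illustrative use inside a training step (placeholders, not the authors' code):
#   loss.backward()
#   centralize_gradients(model)   # applied only during training
#   optimizer.step()
#   centralize_weights(model)     # adds no cost at inference time
```

Because both operations run outside the forward pass, a trained model is used exactly as any other network; this is the runtime advantage over batch normalization mentioned in the abstract.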
Pages: 227-239
Page count: 13
Related papers
50 records in total (10 shown below)
  • [1] On Centralization and Unitization of Batch Normalization for Deep ReLU Neural Networks
    Fei, Wen
    Dai, Wenrui
    Li, Chenglin
    Zou, Junni
    Xiong, Hongkai
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2024, 72 : 2827 - 2841
  • [2] Weight normalized deep neural networks
    Xu, Yixi
    Wang, Xiao
    [J]. STAT, 2021, 10 (01):
  • [3] Enhanced gradient learning for deep neural networks
    Yan, Ming
    Yang, Jianxi
    Chen, Cen
    Zhou, Joey Tianyi
    Pan, Yi
    Zeng, Zeng
    [J]. IET IMAGE PROCESSING, 2022, 16 (02) : 365 - 377
  • [4] Adaptive Weight Decay for Deep Neural Networks
    Nakamura, Kensuke
    Hong, Byung-Woo
    [J]. IEEE ACCESS, 2019, 7 : 118857 - 118865
  • [5] Exploring weight symmetry in deep neural networks
    Hu, Shell Xu
    Zagoruyko, Sergey
    Komodakis, Nikos
    [J]. COMPUTER VISION AND IMAGE UNDERSTANDING, 2019, 187
  • [6] Hierarchical Weight Averaging for Deep Neural Networks
    Gu, Xiaozhe
    Zhang, Zixun
    Jiang, Yuncheng
    Luo, Tao
    Zhang, Ruimao
    Cui, Shuguang
    Li, Zhen
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12276 - 12287
  • [7] Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
    Li, Chunyuan
    Chen, Changyou
    Carlson, David
    Carin, Lawrence
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1788 - 1794
  • [8] GRADUAL SURROGATE GRADIENT LEARNING IN DEEP SPIKING NEURAL NETWORKS
    Chen, Yi
    Zhang, Silin
    Ren, Shiyu
    Qu, Hong
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 8927 - 8931
  • [9] Heterogeneous gradient computing optimization for scalable deep neural networks
    Moreno-Álvarez, Sergio
    Paoletti, Mercedes E.
    Rico-Gallego, Juan A.
    Haut, Juan M.
    [J]. THE JOURNAL OF SUPERCOMPUTING, 2022, 78 : 13455 - 13469
  • [10] Learning dynamics of gradient descent optimization in deep neural networks
    Wu, Wei
    Jing, Xiaoyuan
    Du, Wencai
    Chen, Guoliang
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (05)