Centered Weight Normalization in Accelerating Training of Deep Neural Networks

Cited by: 34
Authors
Huang, Lei [1 ]
Liu, Xianglong [1 ]
Liu, Yang [1 ]
Lang, Bo [1 ]
Tao, Dacheng [2 ]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Univ Sydney, FEIT, Sch IT, UBTECH Sydney AI Ctr, Sydney, NSW, Australia
Keywords
DOI
10.1109/ICCV.2017.305
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Training deep neural networks is difficult because of the pathological curvature of the optimization problem. Re-parameterization is an effective way to relieve this problem, either by approximately learning the curvature or by constraining the weights to solutions with properties favorable for optimization. This paper proposes to re-parameterize the input weight of each neuron in a deep neural network by centering it to zero mean and normalizing it to unit norm, followed by a learnable scalar parameter that adjusts the norm of the weight. This technique implicitly stabilizes the distributions of activations; moreover, it improves the conditioning of the optimization problem and thus accelerates the training of deep neural networks. In practice it can be wrapped as a linear module and plugged into any architecture in place of the standard linear module. We highlight the benefits of our method on both multi-layer perceptrons and convolutional neural networks, and demonstrate its scalability and efficiency on the SVHN, CIFAR-10, CIFAR-100 and ImageNet datasets.
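
The abstract describes wrapping the re-parameterization as a linear module that replaces the standard one. As a rough illustration only, the following PyTorch sketch applies the centering and unit-norm scaling described above to each neuron's input weight; the class name CWNLinear, the initialization scale, and the bias handling are our assumptions, not details taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CWNLinear(nn.Module):
    """Centered-weight-normalized linear layer (illustrative sketch).

    Each neuron's input weight v is re-parameterized as
        w = g * (v - mean(v)) / ||v - mean(v)||_2,
    i.e. centered to zero mean and scaled to unit norm, with a
    learnable scalar g per neuron that adjusts the weight norm.
    """
    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        # Underlying unconstrained parameter v, one row per output neuron.
        self.v = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        # Learnable per-neuron scale g, initialized to 1.
        self.g = nn.Parameter(torch.ones(out_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        v = self.v - self.v.mean(dim=1, keepdim=True)              # center each row to zero mean
        w = self.g.unsqueeze(1) * v / v.norm(dim=1, keepdim=True)  # unit norm, then rescale by g
        return F.linear(x, w, self.bias)

# Drop-in replacement for nn.Linear:
layer = CWNLinear(784, 256)
y = layer(torch.randn(32, 784))  # -> shape (32, 256)
```

Because the centering and scaling are part of the forward pass, gradients flow through them, so the effective weight stays zero-mean and unit-norm (up to the scale g) throughout training.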
Pages: 2822-2830
Page count: 9