Centered Weight Normalization in Accelerating Training of Deep Neural Networks

Cited by: 34
Authors
Huang, Lei [1 ]
Liu, Xianglong [1 ]
Liu, Yang [1 ]
Lang, Bo [1 ]
Tao, Dacheng [2 ]
Affiliations
[1] Beihang Univ, State Key Lab Software Dev Environm, Beijing, Peoples R China
[2] Univ Sydney, FEIT, Sch IT, UBTECH Sydney AI Ctr, Sydney, NSW, Australia
Keywords
DOI
10.1109/ICCV.2017.305
CLC Number
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Training deep neural networks is difficult because of the pathological curvature of the optimization problem. Re-parameterization is an effective way to relieve this problem, either by approximately learning the curvature or by constraining the weights to solutions with properties favorable for optimization. This paper proposes to re-parameterize the input weight of each neuron in a deep neural network by centering it to zero mean and normalizing it to unit norm, followed by a learnable scalar parameter that adjusts the norm of the weight. This technique implicitly stabilizes the distributions of activations; moreover, it improves the conditioning of the optimization problem and thus accelerates the training of deep neural networks. In practice it can be wrapped as a linear module and plugged into any architecture in place of the standard linear module. We highlight the benefits of our method on both multi-layer perceptrons and convolutional neural networks, and demonstrate its scalability and efficiency on the SVHN, CIFAR-10, CIFAR-100 and ImageNet datasets.
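
The abstract describes wrapping the re-parameterization as a linear module that replaces the standard one. As a rough illustration only, the following PyTorch sketch applies the centering and unit-norm scaling described above to each neuron's input weight; the class name CWNLinear, the initialization scale, and the bias handling are our assumptions, not details taken from the paper:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CWNLinear(nn.Module):
    """Centered-weight-normalized linear layer (illustrative sketch).

    Each neuron's input weight v is re-parameterized as
        w = g * (v - mean(v)) / ||v - mean(v)||_2,
    i.e. centered to zero mean and scaled to unit norm, with a
    learnable scalar g per neuron that adjusts the weight norm.
    """
    def __init__(self, in_features, out_features, bias=True):
        super().__init__()
        # Underlying unconstrained parameter v, one row per output neuron.
        self.v = nn.Parameter(torch.randn(out_features, in_features) * 0.05)
        # Learnable per-neuron scale g, initialized to 1.
        self.g = nn.Parameter(torch.ones(out_features))
        self.bias = nn.Parameter(torch.zeros(out_features)) if bias else None

    def forward(self, x):
        v = self.v - self.v.mean(dim=1, keepdim=True)              # center each row to zero mean
        w = self.g.unsqueeze(1) * v / v.norm(dim=1, keepdim=True)  # unit norm, then rescale by g
        return F.linear(x, w, self.bias)

# Drop-in replacement for nn.Linear:
layer = CWNLinear(784, 256)
y = layer(torch.randn(32, 784))  # -> shape (32, 256)
```

Because the centering and scaling are part of the forward pass, gradients flow through them, so the effective weight stays zero-mean and unit-norm (up to the scale g) throughout training.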
Pages: 2822-2830
Page count: 9