An Adaptive Learning Rate Schedule for SIGNSGD Optimizer in Neural Networks

Cited by: 0
Authors
Kang Wang
Tao Sun
Yong Dou
Affiliations
[1] National University of Defense Technology, The National Laboratory for Parallel and Distributed Processing, School of Computer
Source
Neural Processing Letters | 2022, Vol. 54
Keywords
SIGNSGD optimizer; Adaptive learning rate strategy; Communication; Fast convergence; Neural networks
DOI
Not available
Abstract
SIGNSGD can dramatically improve the performance of training large neural networks by transmitting only the sign of each minibatch stochastic gradient, which compresses gradient communication while retaining a convergence rate at the level of standard stochastic gradient descent (SGD). Meanwhile, the learning rate plays a vital role in training neural networks, but existing learning rate optimization strategies mainly face the following problems: (1) learning rate decay methods produce small learning rates that slow convergence, and they require extra hyper-parameters beyond the initial learning rate, demanding more human involvement; (2) adaptive gradient algorithms generalize poorly and also rely on additional hyper-parameters; (3) generating learning rates via two-level optimization models is difficult and time-consuming during training. To this end, we propose, for the first time, a novel adaptive learning rate schedule for neural network training with the SIGNSGD optimizer. In our method, based on the theoretical observation that the upper bound on the convergence rate is minimized with respect to the current learning rate at each iteration, the current learning rate can be expressed by a formula that depends only on the historical learning rates. Then, given an initial value, learning rates at different training stages can be obtained adaptively. Our proposed method has the following advantages: (1) it is automatic and introduces no hyper-parameters beyond a single initial value, thus reducing manual tuning; (2) it converges faster and outperforms standard SGD; (3) it lets neural networks achieve better performance with fewer gradient communication bits. Three numerical simulations are conducted on different neural networks with three public datasets (MNIST, Cifar-10, and Cifar-100), and the numerical results demonstrate the efficiency of our proposed approach.
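As a rough illustration of the setting the abstract describes, the sketch below implements the basic SIGNSGD update (apply only the sign of each stochastic gradient) as a PyTorch optimizer. This is a minimal sketch assuming PyTorch is available; the class name SignSGD and the helper next_lr with its simple decay are hypothetical placeholders, not the paper's derived schedule, whose closed-form learning-rate recursion is not reproduced in this abstract.

```python
# Minimal sketch, assuming PyTorch is installed. The sign-of-gradient update is the
# standard SIGNSGD rule; the learning-rate schedule below is a hypothetical
# placeholder, NOT the recursion derived in the paper.
import torch


class SignSGD(torch.optim.Optimizer):
    """x_{t+1} = x_t - lr_t * sign(g_t): only the gradient sign is used,
    which is what enables 1-bit-per-coordinate gradient communication."""

    def __init__(self, params, lr=1e-3):
        defaults = dict(lr=lr)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            for p in group["params"]:
                if p.grad is None:
                    continue
                p.add_(torch.sign(p.grad), alpha=-group["lr"])
        return loss


def next_lr(lr_history):
    # Hypothetical stand-in: the paper expresses the current learning rate purely
    # in terms of previous learning rates (derived by minimizing an upper bound on
    # the convergence rate); here we simply decay the most recent value.
    return lr_history[-1] * 0.999


# Usage sketch (model, loss_fn, data assumed to exist elsewhere):
# opt = SignSGD(model.parameters(), lr=0.01)
# lrs = [0.01]
# for x, y in data:
#     opt.zero_grad()
#     loss_fn(model(x), y).backward()
#     opt.step()
#     lrs.append(next_lr(lrs))
#     for g in opt.param_groups:
#         g["lr"] = lrs[-1]
```

The design point mirrored here is that the schedule needs only its own history and one initial value, so it plugs in by rewriting group["lr"] once per iteration without any extra hyper-parameters.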
Pages: 803-816
Number of pages: 13
Related Papers
50 items in total
  • [1] An Adaptive Learning Rate Schedule for SIGNSGD Optimizer in Neural Networks
    Wang, Kang
    Sun, Tao
    Dou, Yong
    NEURAL PROCESSING LETTERS, 2022, 54 (02) : 803 - 816
  • [2] An Adaptive Optimization Method Based on Learning Rate Schedule for Neural Networks
    Yi, Dokkyun
    Ji, Sangmin
    Park, Jieun
    APPLIED SCIENCES-BASEL, 2021, 11 (02): 1 - 11
  • [3] Adaptive Learning Rate for Unsupervised Learning of Deep Neural Networks
    Golovko, Vladimir
    Mikhno, Egor
    Kroschanka, Aliaksandr
    Chodyka, Marta
    Lichograj, Piotr
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [4] A Novel Learning Rate Schedule in Optimization for Neural Networks and It's Convergence
    Park, Jieun
    Yi, Dokkyun
    Ji, Sangmin
    SYMMETRY-BASEL, 2020, 12 (04):
  • [5] The Effect of Adaptive Learning Rate on the Accuracy of Neural Networks
    Jepkoech, Jennifer
    Mugo, David Muchangi
    Kenduiywo, Benson K.
    Too, Edna Chebet
    INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2021, 12 (08) : 736 - 751
  • [6] AutoLrOpt: An Efficient Optimizer Using Automatic Setting of Learning Rate for Deep Neural Networks
    Merrouchi, Mohamed
    Atifi, Khalid
    Skittou, Mustapha
    Benyoussef, Youssef
    Gadi, Taoufiq
    IEEE ACCESS, 2024, 12 : 83154 - 83168
  • [7] Performance Enhancement of Adaptive Neural Networks Based on Learning Rate
    Zubair, Swaleha
    Singha, Anjani Kumar
    Pathak, Nitish
    Sharma, Neelam
    Urooj, Shabana
    Larguech, Samia Rabeh
    CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (01): : 2005 - 2019
  • [8] A novel approach for implementation of adaptive learning rate neural networks
    Rezaie, MG
    Farbiz, F
    Moghaddam, EZ
    Hooshmand, A
    22ND NORCHIP CONFERENCE, PROCEEDINGS, 2004, : 79 - 82
  • [9] Adaptive Learning Rate and Momentum for Training Deep Neural Networks
    Hao, Zhiyong
    Jiang, Yixuan
    Yu, Huihua
    Chiang, Hsiao-Dong
    MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 : 381 - 396