Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Cited: 31
Authors
Iiduka, Hideaki [1 ]
Affiliation
[1] Meiji Univ, Dept Comp Sci, Kawasaki, Kanagawa 214-8571, Japan
Funding
Japan Society for the Promotion of Science;
Keywords
Optimization; Convergence; Stochastic processes; Deep learning; Approximation algorithms; Training; Heuristic algorithms; Adaptive mean square gradient (AMSGrad); adaptive moment estimation (Adam); adaptive-learning-rate optimization algorithm; deep neural network; learning rate; nonconvex stochastic optimization; SUBGRADIENT METHODS;
DOI
10.1109/TCYB.2021.3107415
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Subject Classification
0812;
Abstract
This article deals with nonconvex stochastic optimization problems in deep learning. It provides theoretically grounded learning rates for adaptive-learning-rate optimization algorithms (e.g., Adam and AMSGrad) to approximate stationary points of such problems, and these rates are shown to allow faster convergence than previously reported for these algorithms. In numerical experiments on text and image classification, the algorithms perform better with constant learning rates than with diminishing learning rates.
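The abstract's claim concerns the step-size schedule fed to an Adam-type update rather than the update rule itself. As a point of reference, below is a minimal NumPy sketch of the AMSGrad update run with a constant learning rate; the toy objective, noise level, hyperparameter values, and problem dimension are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

def grad(x, rng):
    """Stochastic gradient of the toy nonconvex objective
    f(x) = sum(x_i^2 + 3*sin(x_i)^2); the objective and the
    noise level are illustrative assumptions."""
    g = 2 * x + 3 * np.sin(2 * x)                  # exact gradient
    return g + 0.1 * rng.standard_normal(x.shape)  # additive gradient noise

def amsgrad(alpha, steps=2000, beta1=0.9, beta2=0.999, eps=1e-8, seed=0):
    """AMSGrad iteration with a constant learning rate alpha."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-2.0, 2.0, size=10)
    m = np.zeros_like(x)      # first-moment (momentum) estimate
    v = np.zeros_like(x)      # second-moment estimate
    v_hat = np.zeros_like(x)  # elementwise running max of v
    for _ in range(steps):
        g = grad(x, rng)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        v_hat = np.maximum(v_hat, v)  # the max step is what separates AMSGrad from Adam
        x -= alpha * m / (np.sqrt(v_hat) + eps)
    return x

x = amsgrad(alpha=1e-2)  # constant learning rate, as advocated in the abstract
print("squared gradient norm at the last iterate:",
      np.linalg.norm(2 * x + 3 * np.sin(2 * x)) ** 2)
```

Replacing the constant `alpha` with a step-dependent value such as `alpha / np.sqrt(n + 1)` inside the loop gives the kind of diminishing-rate baseline that the abstract's experiments compare against.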
Pages: 13250-13261
Page count: 12
Related papers
50 items in total
  • [21] Modeling of aquifer vulnerability index using deep learning neural networks coupling with optimization algorithms
    Elzain, Hussam Eldin
    Chung, Sang Yong
    Senapathi, Venkatramanan
    Sekar, Selvam
    Park, Namsik
    Mahmoud, Ahmed Abdulhamid
    [J]. ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH, 2021, 28 (40) : 57030 - 57045
  • [22] Rates of learning in gradient and genetic training of recurrent neural networks
    Riaza, R
    Zufiria, PJ
    [J]. ARTIFICIAL NEURAL NETS AND GENETIC ALGORITHMS, 1999, : 95 - 99
  • [24] RALR: Random Amplify Learning Rates for Training Neural Networks
    Deng, Jiali
    Gong, Haigang
    Liu, Minghui
    Xie, Tianshu
    Cheng, Xuan
    Wang, Xiaomin
    Liu, Ming
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (01)
  • [25] Optimization of evolutionary neural networks using hybrid learning algorithms
    Abraham, A
    [J]. PROCEEDINGS OF THE 2002 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1-3, 2002, : 2797 - 2802
  • [26] Fast Backpropagation Learning Using Optimization of Learning Rate for Pulsed Neural Networks
    Yamamoto, Kenji
    Koakutsu, Seiichi
    Okamoto, Takashi
    Hirata, Hironori
    [J]. ELECTRONICS AND COMMUNICATIONS IN JAPAN, 2011, 94 (07) : 27 - 34
  • [27] An adaptive learning rate for the training of B-spline networks
    Chan, CW
    Jin, H
    Cheung, KC
    Zhang, HY
    [J]. UKACC INTERNATIONAL CONFERENCE ON CONTROL '98, VOLS I&II, 1998, : 342 - 347
  • [28] Learning to Optimize: Training Deep Neural Networks for Interference Management
    Sun, Haoran
    Chen, Xiangyi
    Shi, Qingjiang
    Hong, Mingyi
    Fu, Xiao
    Sidiropoulos, Nicholas D.
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2018, 66 (20) : 5438 - 5453
  • [29] Learning Not to Learn: Training Deep Neural Networks with Biased Data
    Kim, Byungju
    Kim, Hyunwoo
    Kim, Kyungsu
    Kim, Sungjin
    Kim, Junmo
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 9004 - 9012
  • [30] Performance Enhancement of Adaptive Neural Networks Based on Learning Rate
    Zubair, Swaleha
    Singha, Anjani Kumar
    Pathak, Nitish
    Sharma, Neelam
    Urooj, Shabana
    Larguech, Samia Rabeh
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2023, 74 (01): 2005 - 2019