Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Cited by: 31
Authors
Iiduka, Hideaki [1]
Affiliations
[1] Meiji Univ, Dept Comp Sci, Tokyo, Kanagawa 2148571, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Optimization; Convergence; Stochastic processes; Deep learning; Approximation algorithms; Training; Heuristic algorithms; Adaptive mean square gradient (AMSGrad); adaptive moment estimation (Adam); adaptive-learning-rate optimization algorithm; deep neural network; learning rate; nonconvex stochastic optimization; SUBGRADIENT METHODS;
DOI
10.1109/TCYB.2021.3107415
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This article deals with nonconvex stochastic optimization problems in deep learning. It provides theoretically grounded learning rates with which adaptive-learning-rate optimization algorithms (e.g., Adam and AMSGrad) approximate the stationary points of such problems. These rates are shown to allow faster convergence than previously reported for these algorithms. In numerical experiments on text and image classification, the algorithms perform better with constant learning rates than algorithms using diminishing learning rates.
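The contrast the abstract draws between constant and diminishing learning rates can be illustrated with a small sketch. It is not the article's algorithm or experiment: the update below is the textbook Adam/AMSGrad step with bias correction as found in common implementations, and the toy objective f(x) = x^4 - 2x^2, the noise level, the step count, the schedule alpha / sqrt(t), and all function names are assumptions chosen only for illustration.

```python
import math
import random


def constant_lr(alpha):
    """Schedule that returns the same learning rate alpha at every step."""
    return lambda t: alpha


def diminishing_lr(alpha):
    """Schedule alpha / sqrt(t), one common diminishing choice (an assumption here)."""
    return lambda t: alpha / math.sqrt(t)


def adaptive_step(x, grad, state, lr, t, beta1=0.9, beta2=0.999,
                  eps=1e-8, amsgrad=False):
    """One textbook Adam step; with amsgrad=True the running maximum of v is used."""
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad            # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad     # second-moment estimate
    v_hat = max(v_hat, v) if amsgrad else v       # AMSGrad keeps the largest v seen
    m_corr = m / (1 - beta1 ** t)                 # bias corrections
    v_corr = v_hat / (1 - beta2 ** t)
    x = x - lr(t) * m_corr / (math.sqrt(v_corr) + eps)
    return x, (m, v, v_hat)


def run(lr, steps=500, x0=2.0, amsgrad=False, noise=0.1, seed=0):
    """Minimize the toy nonconvex f(x) = x**4 - 2*x**2 from noisy gradients.

    Returns the final iterate and the magnitude of the exact gradient there,
    which measures how close the iterate is to a stationary point.
    """
    rng = random.Random(seed)
    grad_f = lambda x: 4 * x ** 3 - 4 * x         # exact gradient of the toy f
    x, state = x0, (0.0, 0.0, 0.0)
    for t in range(1, steps + 1):
        g = grad_f(x) + rng.gauss(0.0, noise)     # stochastic gradient sample
        x, state = adaptive_step(x, g, state, lr, t, amsgrad=amsgrad)
    return x, abs(grad_f(x))


if __name__ == "__main__":
    for name, schedule in [("constant", constant_lr(0.01)),
                           ("diminishing", diminishing_lr(0.01))]:
        for use_amsgrad in (False, True):
            x, gnorm = run(schedule, amsgrad=use_amsgrad)
            label = "AMSGrad" if use_amsgrad else "Adam"
            print(f"{label:8s} {name:12s} x={x:.4f} |grad|={gnorm:.4f}")
```

Running the script prints the final iterate and gradient magnitude for each optimizer and schedule. A one-dimensional toy problem like this says nothing about the article's reported results; it only makes the two kinds of schedule concrete.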
Pages: 13250-13261
Number of pages: 12
Related papers
50 in total
  • [1] Adaptive Learning Rate and Momentum for Training Deep Neural Networks
    Hao, Zhiyong
    Jiang, Yixuan
    Yu, Huihua
    Chiang, Hsiao-Dong
    [J]. MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2021: RESEARCH TRACK, PT III, 2021, 12977 : 381 - 396
  • [2] Adaptive Learning Rate for Unsupervised Learning of Deep Neural Networks
    Golovko, Vladimir
    Mikhno, Egor
    Kroschanka, Aliaksandr
    Chodyka, Marta
    Lichograj, Piotr
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [3] A NONMONOTONE LEARNING RATE STRATEGY FOR SGD TRAINING OF DEEP NEURAL NETWORKS
    Keskar, Nitish Shirish
    Saon, George
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4974 - 4978
  • [4] The Optimization of Learning Rate for Neural Networks
    Huang, Weizhe
    Chen, Chi-Hua
    [J]. ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2023, 19 : 17 - 17
  • [5] An Adaptive Optimization Method Based on Learning Rate Schedule for Neural Networks
    Yi, Dokkyun
    Ji, Sangmin
    Park, Jieun
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (02): : 1 - 11
  • [6] LLR: Learning learning rates by LSTM for training neural networks
    Yu, Changyong
    Qi, Xin
    Ma, Haitao
    He, Xin
    Wang, Cuirong
    Zhao, Yuhai
    [J]. NEUROCOMPUTING, 2020, 394 : 41 - 50
  • [7] Cyclical Learning Rates for Training Neural Networks
    Smith, Leslie N.
    [J]. 2017 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2017), 2017, : 464 - 472
  • [8] CONSTRUCTIVE APPROACHES FOR TRAINING OF WAVELET NEURAL NETWORKS USING ADAPTIVE LEARNING RATE
    Skhiri, Mohamed Zine El Abidine
    Chtourou, Mohamed
    [J]. INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2013, 11 (03)
  • [9] Demystifying Learning Rate Policies for High Accuracy Training of Deep Neural Networks
    Wu, Yanzhao
    Liu, Ling
    Bae, Juhyun
    Chow, Ka-Ho
    Iyengar, Arun
    Pu, Calton
    Wei, Wenqi
    Yu, Lei
    Zhang, Qi
    [J]. 2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019, : 1971 - 1980
  • [10] Fixing the problems of deep neural networks will require better training data and learning algorithms
    Bowers, Jeffrey S.
    Malhotra, Gaurav
    Dujmovic, Marin
    Montero, Milton Llera
    Tsvetkov, Christian
    Biscione, Valerio
    Puebla, Guillermo
    Adolfi, Federico
    Hummel, John E.
    Heaton, Rachel F.
    Evans, Benjamin D.
    Mitchell, Jeffrey
    Blything, Ryan
    Anderson, Barton L.
    Storrs, Katherine R.
    Fleming, Roland W.
    Bever, Thomas G.
    Chomsky, Noam
    Fong, Sandiway
    Piattelli-Palmarini, Massimo
    Chandran, Keerthi S.
    Paul, Amrita Mukherjee
    Paul, Avijit
    Ghosh, Kuntal
    de Vries, Jelmer Philip
    Flachot, Alban
    Morimoto, Takuma
    Gegenfurtner, Karl R.
    DiCarlo, James J.
    Yamins, Daniel L. K.
    Ferguson, Michael E.
    Fedorenko, Evelina
    Bethge, Matthias
    Bonnen, Tyler
    Schrimpf, Martin
    German, Joseph Scott
    Jacobs, Robert A.
    Golan, Tal
    Taylor, JohnMark
    Schutt, Heiko
    Peters, Benjamin
    Sommers, Rowan P.
    Seeliger, Katja
    Doerig, Adrien
    Linton, Paul
    Konkle, Talia
    van Gerven, Marcel
    Kording, Konrad
    Richards, Blake
    Kietzmann, Tim C.
    [J]. BEHAVIORAL AND BRAIN SCIENCES, 2023, 46