Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Cited by: 31
Authors
Iiduka, Hideaki [1 ]
Affiliations
[1] Meiji Univ, Dept Comp Sci, Tokyo, Kanagawa 2148571, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Optimization; Convergence; Stochastic processes; Deep learning; Approximation algorithms; Training; Heuristic algorithms; Adaptive mean square gradient (AMSGrad); adaptive moment estimation (Adam); adaptive-learning-rate optimization algorithm; deep neural network; learning rate; nonconvex stochastic optimization; SUBGRADIENT METHODS;
DOI
10.1109/TCYB.2021.3107415
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This article deals with nonconvex stochastic optimization problems in deep learning. It provides theoretically grounded learning rates with which adaptive-learning-rate optimization algorithms (e.g., Adam and AMSGrad) approximate the stationary points of such problems, and these rates are shown to allow faster convergence than previously reported for these algorithms. In numerical experiments on text and image classification, the algorithms perform better with constant learning rates than with diminishing learning rates.
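As a minimal illustration of the algorithms the abstract discusses, the sketch below implements the AMSGrad update (Adam augmented with a non-decreasing second-moment estimate) in NumPy and compares a constant learning rate against a diminishing alpha/sqrt(t) schedule on a toy noisy quadratic. The objective, step sizes, and iteration count are illustrative assumptions only; they are not the paper's derived rates or its text- and image-classification setup.

import numpy as np

def amsgrad_step(theta, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # One AMSGrad update; `state` carries the moment estimates (m, v, v_hat).
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad        # moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # moving average of squared gradients
    v_hat = np.maximum(v_hat, v)              # AMSGrad: keep the running maximum of v
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)

# Hypothetical toy comparison (not the paper's experimental setup):
# minimize 0.5 * ||theta||^2 from noisy gradients; the unique
# stationary point is the origin.
rng = np.random.default_rng(0)
for schedule in ("constant", "diminishing"):
    theta = np.ones(5)
    state = (np.zeros(5), np.zeros(5), np.zeros(5))
    for t in range(1, 1001):
        grad = theta + 0.01 * rng.standard_normal(5)   # stochastic gradient
        lr = 1e-2 if schedule == "constant" else 1e-2 / np.sqrt(t)
        theta, state = amsgrad_step(theta, grad, state, lr)
    print(schedule, "final distance to stationary point:", float(np.linalg.norm(theta)))

Running this prints a much smaller final distance for the constant schedule, since the diminishing steps shrink before the iterates reach the stationary point; this mirrors the qualitative behavior the abstract reports.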
Pages: 13250 - 13261
Page count: 12
Related Papers
50 items in total
  • [31] A novel approach for implementation of adaptive learning rate neural networks
    Rezaie, MG
    Farbiz, F
    Moghaddam, EZ
    Hooshmand, A
[J]. 22ND NORCHIP CONFERENCE, PROCEEDINGS, 2004 : 79 - 82
  • [32] An Adaptive Learning Rate Schedule for SIGNSGD Optimizer in Neural Networks
    Wang, Kang
    Sun, Tao
    Dou, Yong
    [J]. NEURAL PROCESSING LETTERS, 2022, 54 (02) : 803 - 816
  • [34] Effective neural network training with adaptive learning rate based on training loss
    Takase, Tomoumi
    Oyama, Satoshi
    Kurihara, Masahito
    [J]. NEURAL NETWORKS, 2018, 101 : 68 - 78
  • [35] Dynamic random distribution learning rate for neural networks training
    Hu, Xueheng
    Wen, Shuhuan
    Lam, H. K.
    [J]. APPLIED SOFT COMPUTING, 2022, 124
  • [36] Application of Meta-Heuristic Algorithms for Training Neural Networks and Deep Learning Architectures: A Comprehensive Review
    Kaveh, Mehrdad
    Mesgari, Mohammad Saadi
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4519 - 4622
  • [38] Unsaturated MLP Neural Networks Training Algorithm using a Piecewise Error Function and Adaptive Learning Rates
    Moallem, Payman
    Ayoughi, S. Arvin
[J]. 2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008 : 55 - 60
  • [39] Dynamic Security Rule Optimization Based on Deep Learning and Adaptive Algorithms
    Hang, Feilu
    Xie, Linjiang
    Zhang, Zhenhong
    Liu, Yuting
    Hu, Jian
    [J]. SCALABLE COMPUTING, 2024, 25 (04) : 2603 - 2613
  • [40] Adaptive algorithms for neural network supervised learning: A deterministic optimization approach
    Magoulas, George D.
    Vrahatis, Michael N.
[J]. INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS, 2006, 16 (07) : 1929 - 1950