Appropriate Learning Rates of Adaptive Learning Rate Optimization Algorithms for Training Deep Neural Networks

Cited by: 31
Authors
Iiduka, Hideaki [1 ]
Affiliations
[1] Meiji Univ, Dept Comp Sci, Tokyo, Kanagawa 2148571, Japan
Funding
Japan Society for the Promotion of Science
Keywords
Optimization; Convergence; Stochastic processes; Deep learning; Approximation algorithms; Training; Heuristic algorithms; Adaptive mean square gradient (AMSGrad); adaptive moment estimation (Adam); adaptive-learning-rate optimization algorithm; deep neural network; learning rate; nonconvex stochastic optimization; SUBGRADIENT METHODS;
DOI
10.1109/TCYB.2021.3107415
CLC Classification Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
This article deals with nonconvex stochastic optimization problems in deep learning. It provides theoretically grounded learning rates with which adaptive-learning-rate optimization algorithms (e.g., Adam and AMSGrad) approximate the stationary points of such problems, and these rates are shown to allow faster convergence than previously reported for these algorithms. In numerical experiments on text and image classification, the algorithms perform better with constant learning rates than with diminishing learning rates.
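As a minimal illustration of the algorithms the abstract discusses, the sketch below implements the AMSGrad update (Adam augmented with a non-decreasing second-moment estimate) in NumPy and compares a constant learning rate against a diminishing alpha/sqrt(t) schedule on a toy noisy quadratic. The objective, step sizes, and iteration count are illustrative assumptions only; they are not the paper's derived rates or its text- and image-classification setup.

import numpy as np

def amsgrad_step(theta, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    # One AMSGrad update; `state` carries the moment estimates (m, v, v_hat).
    m, v, v_hat = state
    m = beta1 * m + (1 - beta1) * grad        # moving average of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # moving average of squared gradients
    v_hat = np.maximum(v_hat, v)              # AMSGrad: keep the running maximum of v
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, (m, v, v_hat)

# Hypothetical toy comparison (not the paper's experimental setup):
# minimize 0.5 * ||theta||^2 from noisy gradients; the unique
# stationary point is the origin.
rng = np.random.default_rng(0)
for schedule in ("constant", "diminishing"):
    theta = np.ones(5)
    state = (np.zeros(5), np.zeros(5), np.zeros(5))
    for t in range(1, 1001):
        grad = theta + 0.01 * rng.standard_normal(5)   # stochastic gradient
        lr = 1e-2 if schedule == "constant" else 1e-2 / np.sqrt(t)
        theta, state = amsgrad_step(theta, grad, state, lr)
    print(schedule, "final distance to stationary point:", float(np.linalg.norm(theta)))

Running this prints a much smaller final distance for the constant schedule, since the diminishing steps shrink before the iterates reach the stationary point; this mirrors the qualitative behavior the abstract reports.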
Pages: 13250 - 13261
Page count: 12
Related Papers
50 items in total
  • [31] A novel approach for implementation of adaptive learning rate neural networks
    Rezaie, MG
    Farbiz, F
    Moghaddam, EZ
    Hooshmand, A
[J]. 22ND NORCHIP CONFERENCE, PROCEEDINGS, 2004 : 79 - 82
  • [32] An Adaptive Learning Rate Schedule for SIGNSGD Optimizer in Neural Networks
    Wang, Kang
    Sun, Tao
    Dou, Yong
    [J]. NEURAL PROCESSING LETTERS, 2022, 54 (02) : 803 - 816
  • [34] Effective neural network training with adaptive learning rate based on training loss
    Takase, Tomoumi
    Oyama, Satoshi
    Kurihara, Masahito
    [J]. NEURAL NETWORKS, 2018, 101 : 68 - 78
  • [35] Dynamic random distribution learning rate for neural networks training
    Hu, Xueheng
    Wen, Shuhuan
    Lam, H. K.
    [J]. APPLIED SOFT COMPUTING, 2022, 124
  • [36] Application of Meta-Heuristic Algorithms for Training Neural Networks and Deep Learning Architectures: A Comprehensive Review
    Kaveh, Mehrdad
    Mesgari, Mohammad Saadi
    [J]. NEURAL PROCESSING LETTERS, 2023, 55 (04) : 4519 - 4622
  • [38] Unsaturated MLP Neural Networks Training Algorithm using a Piecewise Error Function and Adaptive Learning Rates
    Moallem, Payman
    Ayoughi, S. Arvin
[J]. 2008 INTERNATIONAL SYMPOSIUM ON TELECOMMUNICATIONS, VOLS 1 AND 2, 2008 : 55 - 60
  • [39] Dynamic Security Rule Optimization Based on Deep Learning and Adaptive Algorithms
    Hang, Feilu
    Xie, Linjiang
    Zhang, Zhenhong
    Liu, Yuting
    Hu, Jian
    [J]. SCALABLE COMPUTING, 2024, 25 (04) : 2603 - 2613
  • [40] Adaptive algorithms for neural network supervised learning: A deterministic optimization approach
    Magoulas, George D.
    Vrahatis, Michael N.
[J]. INTERNATIONAL JOURNAL OF BIFURCATION AND CHAOS, 2006, 16 (07) : 1929 - 1950