Stochastic Gradient Descent with Polyak's Learning Rate

Cited by: 13
Authors
Prazeres, Mariana [1 ]
Oberman, Adam M. [1 ]
Affiliations
[1] McGill Univ, Dept Math & Stat, Montreal, PQ, Canada
Keywords
Stochastic gradient descent; Learning rate; Polyak's learning rate; Optimization; Strong convexity
DOI
10.1007/s10915-021-01628-3
Chinese Library Classification (CLC)
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
Stochastic gradient descent (SGD) for strongly convex functions converges at the rate O(1/k). However, achieving good results in practice requires tuning the parameters of the algorithm, such as the learning rate. In this paper we propose a generalization of the Polyak step size, used for subgradient methods, to stochastic gradient descent. We prove non-asymptotic convergence at the rate O(1/k), with a rate constant that can be better than the corresponding constant for optimally scheduled SGD. We demonstrate that the method is effective in practice, both on convex optimization problems and on training deep neural networks, and we compare the results to the theoretical rate.
Pages: 16
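For illustration, the sketch below applies a stochastic Polyak-type step size, eta_k = (f_i(x_k) - f_i*) / ||grad f_i(x_k)||^2, to a least-squares problem in which every per-sample optimal value f_i* is zero (the interpolation setting), so the classical Polyak rule can be evaluated exactly. This is a minimal sketch of the general idea, not the paper's exact algorithm; the problem data (A, b, x_true) and all variable names are illustrative assumptions.

import numpy as np

# Minimal sketch: SGD with a Polyak-type step size on an interpolating
# least-squares problem. Not the paper's exact method; problem data and
# names are illustrative assumptions.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true                       # consistent system: every f_i* = 0

x = np.zeros(d)
for k in range(2000):
    i = rng.integers(n)              # draw one sample uniformly
    r = A[i] @ x - b[i]              # residual of sample i
    f_i = 0.5 * r * r                # per-sample loss f_i(x)
    g = r * A[i]                     # gradient of f_i at x
    g_norm2 = g @ g
    if g_norm2 > 1e-12:              # skip near-stationary samples
        eta = f_i / g_norm2          # Polyak step: (f_i - f_i*) / ||g||^2
        x = x - eta * g

print("distance to minimizer:", np.linalg.norm(x - x_true))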
Related Papers
50 records in total
  • [1] Stochastic Gradient Descent with Polyak’s Learning Rate
    Mariana Prazeres
    Adam M. Oberman
    [J]. Journal of Scientific Computing, 2021, 89
  • [2] Stochastic Gradient Descent with Preconditioned Polyak Step-Size
    Abdukhakimov, F.
    Xiang, C.
    Kamzolov, D.
    Takac, M.
    [J]. COMPUTATIONAL MATHEMATICS AND MATHEMATICAL PHYSICS, 2024, 64 (04) : 621 - 634
  • [3] Convergence diagnostics for stochastic gradient descent with constant learning rate
    Chee, Jerry
    Toulis, Panos
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [4] Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
    Liu, Kangqiao
    Liu Ziyin
    Ueda, Masahito
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [5] Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
    Nacson, Mor Shpigel
    Srebro, Nathan
    Soudry, Daniel
    [J]. 22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [6] A Modified Stochastic Gradient Descent Optimization Algorithm With Random Learning Rate for Machine Learning and Deep Learning
    Shim, Duk-Sun
    Shim, Joseph
    [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, 21 (11) : 3825 - 3831
  • [7] Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent
    Wang, Zenghui
    Zhang, Jun
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7060 - 7071
  • [8] Stochastic Gradient Descent and Its Variants in Machine Learning
    Netrapalli, Praneeth
    [J]. JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 2019, 99 (02) : 201 - 213
  • [9] Towards Learning Stochastic Population Models by Gradient Descent
    Kreikemeyer, Justin N.
    Andelfinger, Philipp
    Uhrmacher, Adelinde M.
    [J]. PROCEEDINGS OF THE 38TH ACM SIGSIM INTERNATIONAL CONFERENCE ON PRINCIPLES OF ADVANCED DISCRETE SIMULATION, ACM SIGSIM-PADS 2024, 2024, : 88 - 92