Stochastic Gradient Descent with Polyak's Learning Rate

Cited by: 13
Authors
Prazeres, Mariana [1 ]
Oberman, Adam M. [1 ]
Affiliations
[1] McGill Univ, Dept Math & Stat, Montreal, PQ, Canada
Keywords
Stochastic gradient descent; Learning rate; Polyak's learning rate; Optimization; Strong convexity
DOI
10.1007/s10915-021-01628-3
Chinese Library Classification (CLC)
O29 [Applied Mathematics]
Subject Classification Code
070104
Abstract
Stochastic gradient descent (SGD) for strongly convex functions converges at the rate O(1/k). However, achieving good results in practice requires tuning the parameters of the algorithm, such as the learning rate. In this paper we propose a generalization of the Polyak step size, used for subgradient methods, to stochastic gradient descent. We prove non-asymptotic convergence at the rate O(1/k), with a rate constant that can be better than the corresponding constant for optimally scheduled SGD. We demonstrate that the method is effective in practice, both on convex optimization problems and on training deep neural networks, and we compare the results to the theoretical rate.
Pages: 16
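For illustration, the sketch below applies a stochastic Polyak-type step size, eta_k = (f_i(x_k) - f_i*) / ||grad f_i(x_k)||^2, to a least-squares problem in which every per-sample optimal value f_i* is zero (the interpolation setting), so the classical Polyak rule can be evaluated exactly. This is a minimal sketch of the general idea, not the paper's exact algorithm; the problem data (A, b, x_true) and all variable names are illustrative assumptions.

import numpy as np

# Minimal sketch: SGD with a Polyak-type step size on an interpolating
# least-squares problem. Not the paper's exact method; problem data and
# names are illustrative assumptions.
rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true                       # consistent system: every f_i* = 0

x = np.zeros(d)
for k in range(2000):
    i = rng.integers(n)              # draw one sample uniformly
    r = A[i] @ x - b[i]              # residual of sample i
    f_i = 0.5 * r * r                # per-sample loss f_i(x)
    g = r * A[i]                     # gradient of f_i at x
    g_norm2 = g @ g
    if g_norm2 > 1e-12:              # skip near-stationary samples
        eta = f_i / g_norm2          # Polyak step: (f_i - f_i*) / ||g||^2
        x = x - eta * g

print("distance to minimizer:", np.linalg.norm(x - x_true))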
Related Papers
50 records in total
  • [1] Stochastic Gradient Descent with Polyak’s Learning Rate
    Mariana Prazeres
    Adam M. Oberman
    [J]. Journal of Scientific Computing, 2021, 89
  • [2] Stochastic Gradient Descent with Preconditioned Polyak Step-Size
    Abdukhakimov, F.
    Xiang, C.
    Kamzolov, D.
    Takac, M.
    [J]. COMPUTATIONAL MATHEMATICS AND MATHEMATICAL PHYSICS, 2024, 64 (04) : 621 - 634
  • [3] Convergence diagnostics for stochastic gradient descent with constant learning rate
    Chee, Jerry
    Toulis, Panos
    [J]. INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 84, 2018, 84
  • [4] Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent
    Liu, Kangqiao
    Liu Ziyin
    Ueda, Masahito
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [5] Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate
    Nacson, Mor Shpigel
    Srebro, Nathan
    Soudry, Daniel
    [J]. 22ND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 89, 2019, 89
  • [6] A Modified Stochastic Gradient Descent Optimization Algorithm With Random Learning Rate for Machine Learning and Deep Learning
    Shim, Duk-Sun
    Shim, Joseph
    [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, 21 (11) : 3825 - 3831
  • [7] Incremental PID Controller-Based Learning Rate Scheduler for Stochastic Gradient Descent
    Wang, Zenghui
    Zhang, Jun
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (05) : 7060 - 7071
  • [8] Stochastic Gradient Descent and Its Variants in Machine Learning
    Netrapalli, Praneeth
    [J]. JOURNAL OF THE INDIAN INSTITUTE OF SCIENCE, 2019, 99 (02) : 201 - 213
  • [9] Towards Learning Stochastic Population Models by Gradient Descent
    Kreikemeyer, Justin N.
    Andelfinger, Philipp
    Uhrmacher, Adelinde M.
    [J]. PROCEEDINGS OF THE 38TH ACM SIGSIM INTERNATIONAL CONFERENCE ON PRINCIPLES OF ADVANCED DISCRETE SIMULATION, ACM SIGSIM-PADS 2024, 2024, : 88 - 92