Stochastic Gradient Methods with Preconditioned Updates

Cited by: 0
Authors
Abdurakhmon Sadiev
Aleksandr Beznosikov
Abdulla Jasem Almansoori
Dmitry Kamzolov
Rachael Tappenden
Martin Takáč
Affiliations
[1] Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS)
[2] Moscow Institute of Physics and Technology (MIPT)
[3] Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI)
[4] University of Canterbury
Keywords
Optimization; Non-convex optimization; Stochastic optimization; Scaled methods; Variance reduction
DOI: not available
Abstract
This work considers the non-convex finite-sum minimization problem. Several algorithms exist for such problems, but existing methods often perform poorly when the problem is badly scaled and/or ill-conditioned; a primary goal of this work is to introduce methods that alleviate this issue. To that end, we include a preconditioner based on Hutchinson’s approach to approximating the diagonal of the Hessian and couple it with several gradient-based methods to give new ‘scaled’ algorithms: Scaled SARAH and Scaled L-SVRG. Theoretical complexity guarantees are presented under smoothness assumptions, and we prove linear convergence when both smoothness and the PL condition hold. Because our adaptively scaled methods use approximate partial second-order curvature information, they can better mitigate the impact of badly scaled problems. This improved practical performance is demonstrated in the numerical experiments also presented in this work.
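To make the scaling idea concrete, below is a minimal Python/NumPy sketch of the two ingredients the abstract names: Hutchinson's estimator for the Hessian diagonal (E[z ⊙ Hz] = diag(H) for a Rademacher vector z) and a diagonally preconditioned gradient step with the diagonal truncated away from zero. This is an illustration under stated assumptions, not the authors' implementation: the names `hutchinson_diag` and `scaled_step`, the Hessian-vector-product oracle `hvp`, the truncation floor `alpha`, and the test quadratic are all hypothetical, and the paper's actual methods additionally combine the preconditioner with SARAH/L-SVRG variance-reduced gradient estimators.

```python
import numpy as np

def hutchinson_diag(hvp, dim, num_samples=1, rng=None):
    """Hutchinson estimate of diag(H): for a Rademacher vector z,
    E[z * (H z)] = diag(H), since E[z_i z_j] = 1 if i == j, else 0."""
    rng = np.random.default_rng() if rng is None else rng
    est = np.zeros(dim)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        est += z * hvp(z)                      # elementwise z * (H z)
    return est / num_samples

def scaled_step(w, grad, diag_est, lr=0.1, alpha=1e-3):
    """Preconditioned update w <- w - lr * D^{-1} grad, where D floors the
    estimated |diagonal| at alpha so the preconditioner stays invertible."""
    D = np.maximum(np.abs(diag_est), alpha)
    return w - lr * grad / D

# Demo on a badly scaled quadratic f(w) = 0.5 * w^T A w (hypothetical problem):
# curvatures span four orders of magnitude, yet the preconditioned step
# contracts every coordinate at a comparable rate.
A = np.diag([1.0, 100.0, 10000.0])
w = np.ones(3)
for _ in range(200):
    d = hutchinson_diag(lambda z: A @ z, dim=3, num_samples=5)
    w = scaled_step(w, A @ w, d)
print(w)  # all coordinates are driven toward 0 together
```

For comparison, plain gradient descent on this quadratic needs a step size small enough for the stiff coordinate (curvature 10^4), which makes the flat coordinate (curvature 1) crawl; dividing by the estimated diagonal equalizes the effective per-coordinate step, which is the effect the scaled methods exploit on ill-conditioned problems.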
Pages: 471 - 489 (18 pages)
Related Papers (50 in total)
  • [1] Stochastic Gradient Methods with Preconditioned Updates
    Sadiev, Abdurakhmon
    Beznosikov, Aleksandr
    Almansoori, Abdulla Jasem
    Kamzolov, Dmitry
    Tappenden, Rachael
    Takac, Martin
    [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2024, 201 (02) : 471 - 489
  • [2] Preconditioned Stochastic Gradient Descent
    Li, Xi-Lin
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2018, 29 (05) : 1454 - 1466
  • [3] Constrained and Preconditioned Stochastic Gradient Method
    Jiang, Hong
    Huang, Gang
    Wilford, Paul A.
    Yu, Liangkai
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2015, 63 (10) : 2678 - 2691
  • [4] Stochastic optimization using the stochastic preconditioned conjugate gradient method
    Oakley, DR
    Sues, RH
    [J]. AIAA JOURNAL, 1996, 34 (09) : 1969 - 1971
  • [5] PRECONDITIONED CONJUGATE-GRADIENT METHODS - PREFACE
    AXELSSON, O
[J]. BIT, 1989, 29 (04) : 577 - 582
  • [6] Stochastic gradient descent with differentially private updates
    Song, Shuang
    Chaudhuri, Kamalika
    Sarwate, Anand D.
    [J]. 2013 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2013, : 245 - 248
  • [7] Preconditioned Stochastic Gradient Langevin Dynamics for Deep Neural Networks
    Li, Chunyuan
    Chen, Changyou
    Carlson, David
    Carin, Lawrence
    [J]. THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016, : 1788 - 1794
  • [8] Stochastic Gradient Descent with Preconditioned Polyak Step-Size
    Abdukhakimov, F.
    Xiang, C.
    Kamzolov, D.
    Takac, M.
    [J]. COMPUTATIONAL MATHEMATICS AND MATHEMATICAL PHYSICS, 2024, 64 (04) : 621 - 634
  • [9] Preconditioned Stochastic Gradient Descent Optimisation for Monomodal Image Registration
    Klein, Stefan
    Staring, Marius
    Andersson, Patrik
    Pluim, Josien P. W.
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION (MICCAI 2011), PT II, 2011, 6892 : 549 - +
  • [10] PRECONDITIONED BICONJUGATE GRADIENT METHODS FOR NUMERICAL RESERVOIR SIMULATION
    JOLY, P
    EYMARD, R
    [J]. JOURNAL OF COMPUTATIONAL PHYSICS, 1990, 91 (02) : 298 - 309