Stochastic Gradient Methods with Preconditioned Updates

Cited by: 0
Authors
Abdurakhmon Sadiev
Aleksandr Beznosikov
Abdulla Jasem Almansoori
Dmitry Kamzolov
Rachael Tappenden
Martin Takáč
Affiliations
[1] Ivannikov Institute for System Programming of the Russian Academy of Sciences (ISP RAS)
[2] Moscow Institute of Physics and Technology (MIPT)
[3] Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI)
[4] University of Canterbury
Keywords
Optimization; Non-convex optimization; Stochastic optimization; Scaled methods; Variance reduction
DOI: not available
Abstract
This work considers the non-convex finite-sum minimization problem. Several algorithms exist for such problems, but they often perform poorly when the problem is badly scaled and/or ill-conditioned, and a primary goal of this work is to introduce methods that alleviate this issue. To that end, we include a preconditioner based on Hutchinson's approach to approximating the diagonal of the Hessian and couple it with several gradient-based methods, giving the new 'scaled' algorithms Scaled SARAH and Scaled L-SVRG. Theoretical complexity guarantees are presented under smoothness assumptions, and we prove linear convergence when both smoothness and the PL-condition are assumed. Because our adaptively scaled methods use approximate partial second-order curvature information, they can better mitigate the impact of badly scaled problems. This improved practical performance is demonstrated in the numerical experiments also presented in this work.
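The abstract's central ingredient — Hutchinson's estimator, which recovers the diagonal of the Hessian from Hessian-vector products with random Rademacher vectors, and its use as a diagonal preconditioner for gradient steps — can be sketched as follows. This is a minimal illustration only, not the paper's Scaled SARAH or Scaled L-SVRG: the toy quadratic objective, the step size, and the damping term `alpha` are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy badly-scaled quadratic f(x) = 0.5 * x^T A x, whose Hessian is A.
# The diagonal A makes the scaling pathology explicit; for a diagonal
# Hessian the Hutchinson estimate below is exact, while for general
# Hessians it is unbiased but noisy.
A = np.diag([1.0, 10.0, 100.0, 0.1, 1000.0])
d = A.shape[0]

def hvp(x, v):
    """Hessian-vector product of the quadratic above (independent of x)."""
    return A @ v

def hutchinson_diag(x, hvp_fn, num_samples=200):
    """Estimate diag(Hessian) as the average of z * (H z) over
    Rademacher vectors z, using E[z * (H z)] = diag(H)."""
    est = np.zeros_like(x)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=x.shape)
        est += z * hvp_fn(x, z)
    return est / num_samples

x = np.ones(d)
D = hutchinson_diag(x, hvp)

# Diagonally preconditioned ("scaled") gradient step: each coordinate
# is divided by its curvature estimate, damped by a small alpha.
alpha = 1e-6
grad = A @ x
x_new = x - 0.1 * grad / (np.abs(D) + alpha)
```

With the preconditioner, every coordinate moves by roughly the same relative amount despite the four orders of magnitude spread in curvature; a plain gradient step with a single step size would either diverge on the stiff coordinates or crawl on the flat ones.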
Pages: 471–489 (18 pages)