Second-Order Guarantees of Stochastic Gradient Descent in Nonconvex Optimization

Cited by: 5
Authors
Vlaski, Stefan [1 ,2 ,3 ]
Sayed, Ali H. [1 ]
Affiliations
[1] Ecole Polytech Fed Lausanne, Inst Elect & Micro Engn, CH-1015 Lausanne, Switzerland
[2] Univ Calif Los Angeles, Dept Elect Engn, Los Angeles, CA 90095 USA
[3] Imperial Coll London, Dept Elect & Elect Engn, London SW7 2BX, England
Funding
U.S. National Science Foundation;
Keywords
Convergence; Upper bound; Perturbation methods; Costs; Heuristic algorithms; Cost function; Convex functions; Adaptation; gradient noise; nonconvex cost; stationary points; stochastic optimization
DOI
10.1109/TAC.2021.3131963
CLC number (Chinese Library Classification)
TP [Automation Technology, Computer Technology];
Subject classification code
0812;
Abstract
Recent years have seen increased interest in performance guarantees of gradient descent algorithms for nonconvex optimization. A number of works have shown that gradient noise plays a critical role in the ability of gradient descent recursions to efficiently escape saddle points and reach second-order stationary points. Most available works require the gradient noise component to be bounded with probability one or sub-Gaussian, and leverage concentration inequalities to arrive at high-probability results. We present an alternative approach, relying primarily on mean-square arguments, and show that a more relaxed relative bound on the gradient noise variance is sufficient to ensure efficient escape from saddle points, without the need to inject additional noise, employ alternating step sizes, or rely on a global dispersive noise assumption, as long as a gradient noise component is present in a descent direction for every saddle point.
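To make the abstract's setting concrete, the following is a minimal, hypothetical Python sketch, not the paper's construction. It runs plain SGD on a strict-saddle cost J(w) = (w_1^2 - w_2^2)/2, with zero-mean gradient noise obeying an assumed relative variance bound of the form E||s||^2 <= beta^2 ||grad J(w)||^2 + sigma^2; the parameter names beta, sigma, and mu are illustrative assumptions, and the small sigma^2 term is one simple way to place a noise component along the descent direction at the saddle, as the abstract's last condition requires.

```python
import numpy as np

# Illustration only (not the paper's analysis): plain SGD on the nonconvex
# cost J(w) = (w[0]**2 - w[1]**2) / 2, whose unique stationary point w = 0
# is a strict saddle with a descent (negative-curvature) direction along w[1].

rng = np.random.default_rng(0)

def true_grad(w):
    # Gradient of J(w) = (w[0]**2 - w[1]**2) / 2.
    return np.array([w[0], -w[1]])

def stochastic_grad(w, beta=0.5, sigma=1e-3):
    # Zero-mean noise with an assumed relative variance bound:
    # E||s||^2 = beta**2 * ||grad||^2 + sigma**2. The sigma term keeps a
    # noise component along the descent direction at the saddle itself.
    g = true_grad(w)
    scale = np.sqrt((beta**2 * (g @ g) + sigma**2) / 2.0)
    return g + scale * rng.normal(size=2)

w = np.array([1.0, 0.0])   # start on the stable manifold of the saddle
mu = 0.05                  # small constant step size
steps = 0
while np.linalg.norm(w) < 2.0 and steps < 10_000:
    w = w - mu * stochastic_grad(w)
    steps += 1

print(steps, w)  # leaves the saddle's neighborhood in a few hundred steps
```

With the noise switched off (beta = sigma = 0), the same recursion converges to the saddle along w[1] = 0. That is the contrast the abstract draws: the gradient noise already present in the stochastic recursion, rather than an externally injected perturbation, drives the escape.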
Pages: 6489-6504
Page count: 16
Related papers
50 records in total
  • [1] Second-order guarantees in centralized, federated and decentralized nonconvex optimization
    Vlaski, Stefan
    Sayed, Ali H.
    [J]. COMMUNICATIONS IN INFORMATION AND SYSTEMS, 2020, 20 (03) : 353 - 388
  • [2] ZEROTH-ORDER STOCHASTIC PROJECTED GRADIENT DESCENT FOR NONCONVEX OPTIMIZATION
    Liu, Sijia
    Li, Xingguo
    Chen, Pin-Yu
    Haupt, Jarvis
    Amini, Lisa
    [J]. 2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018), 2018, : 1179 - 1183
  • [3] High Probability Guarantees for Nonconvex Stochastic Gradient Descent with Heavy Tails
    Li, Shaojie
    Liu, Yong
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [4] Inexact proximal stochastic second-order methods for nonconvex composite optimization
    Wang, Xiao
    Zhang, Hongchao
[J]. OPTIMIZATION METHODS & SOFTWARE, 2020, 35 (04) : 808 - 835
  • [5] Second-Order Online Nonconvex Optimization
    Lesage-Landry, Antoine
    Taylor, Joshua A.
    Shames, Iman
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2021, 66 (10) : 4866 - 4872
  • [6] SECOND-ORDER GUARANTEES OF DISTRIBUTED GRADIENT ALGORITHMS
    Daneshmand, Amir
    Scutari, Gesualdo
    Kungurtsev, Vyacheslav
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2020, 30 (04) : 3029 - 3068
  • [7] PA-GD: On the Convergence of Perturbed Alternating Gradient Descent to Second-Order Stationary Points for Structured Nonconvex Optimization
    Lu, Songtao
    Hong, Mingyi
    Wang, Zhengdao
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
  • [8] Gradient Descent on Neurons and its Link to Approximate Second-Order Optimization
    Benzing, Frederik
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022,
  • [9] Stochastic Gradient Descent Combines Second-Order Information for Training Neural Network
    Chen, Minyu
    [J]. ICOMS 2018: 2018 INTERNATIONAL CONFERENCE ON MATHEMATICS AND STATISTICS, 2018, : 69 - 73
  • [10] Second-order Guarantees of Gradient Algorithms over Networks
    Daneshmand, Amir
    Scutari, Gesualdo
    Kungurtsev, Vyacheslav
    [J]. 2018 56TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2018, : 359 - 365