ON THE CONVERGENCE OF MIRROR DESCENT BEYOND STOCHASTIC CONVEX PROGRAMMING

Citations: 15
Authors
Zhou, Zhengyuan [1 ,2 ]
Mertikopoulos, Panayotis [3 ]
Bambos, Nicholas [4 ]
Boyd, Stephen P. [4 ]
Glynn, Peter W. [4 ]
Affiliations
[1] IBM Res, Yorktown Hts, NY 10598 USA
[2] NYU, Stern Sch Business, 550 1St Ave, New York, NY 10012 USA
[3] Univ Grenoble Alpes, CNRS, Grenoble INP, Inria,LIG, F-38000 Grenoble, France
[4] Stanford Univ, Dept Elect Engn, Dept Management Sci & Engn, Stanford, CA 94305 USA
Keywords
mirror descent; nonconvex programming; stochastic optimization; stochastic approximation; variational coherence; exponentiated gradient; algorithm
DOI
10.1137/17M1134925
Chinese Library Classification (CLC)
O29 [Applied Mathematics];
Discipline Code
070104;
Abstract
In this paper, we examine the convergence of mirror descent in a class of stochastic optimization problems that are not necessarily convex (or even quasi-convex) and which we call variationally coherent. Since the standard technique of "ergodic averaging" offers no tangible benefits beyond convex programming, we focus directly on the algorithm's last generated sample (its "last iterate"), and we show that it converges with probability 1 if the underlying problem is coherent. We further consider a localized version of variational coherence which ensures local convergence of stochastic mirror descent (SMD) with high probability. These results contribute to the landscape of nonconvex stochastic optimization by showing that (quasi-)convexity is not essential for convergence to a global minimum: rather, variational coherence, a much weaker requirement, suffices. Finally, building on the above, we reveal an interesting insight regarding the convergence speed of SMD: in problems with sharp minima (such as generic linear programs or concave minimization problems), SMD reaches a minimum point in a finite number of steps (a.s.), even in the presence of persistent gradient noise. This result is to be contrasted with existing black-box convergence rate estimates that are only asymptotic.
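As an illustrative sketch of the update the abstract refers to (not taken from the paper), the Python code below implements one common instance of stochastic mirror descent: the entropic (exponentiated-gradient) update on the probability simplex with a noisy linear objective. Roughly, variational coherence asks that the expected gradient satisfy <grad f(x), x - x*> >= 0 for every feasible x and every solution x*; the mirror map, step-size schedule, function names, and toy data here are assumptions chosen only for illustration.

import numpy as np

def stochastic_mirror_descent(grad_oracle, x0, steps=2000, step=lambda t: 1.0 / np.sqrt(t + 1)):
    """Entropic stochastic mirror descent (exponentiated gradient) on the simplex.

    grad_oracle(x) returns an unbiased, noisy estimate of the gradient at x.
    With the negative-entropy mirror map, the prox step becomes a multiplicative
    update followed by renormalization onto the probability simplex.
    """
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        g = grad_oracle(x)                 # noisy gradient sample
        x = x * np.exp(-step(t) * g)       # mirror (exponentiated-gradient) step
        x = x / x.sum()                    # Bregman projection back onto the simplex
    return x                               # last iterate, the quantity analyzed here

# Toy usage: minimize E[<c + noise, x>] over the simplex; the solution is the
# vertex corresponding to the smallest entry of c (a sharp minimum).
rng = np.random.default_rng(0)
c = np.array([0.3, 0.1, 0.5])
noisy_grad = lambda x: c + 0.1 * rng.standard_normal(c.shape)
print(stochastic_mirror_descent(noisy_grad, np.ones(3) / 3.0))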
Pages: 687-716
Number of pages: 30
Related Papers
50 records in total
  • [31] Taming Convergence for Asynchronous Stochastic Gradient Descent with Unbounded Delay in Non-Convex Learning
    Zhang, Xin
    Liu, Jia
    Zhu, Zhengyuan
    [J]. 2020 59TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2020, : 3580 - 3585
  • [32] Convergence of Mirror Descent Dynamics in the Routing Game
    Krichene, Walid
    Krichene, Syrine
    Bayen, Alexandre
    [J]. 2015 EUROPEAN CONTROL CONFERENCE (ECC), 2015, : 569 - 574
  • [33] Convergence of Stochastic Gradient Descent for PCA
    Shamir, Ohad
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 48, 2016, 48
  • [34] Convex approximations in stochastic programming by semidefinite programming
    István Deák
    Imre Pólik
    András Prékopa
    Tamás Terlaky
    [J]. Annals of Operations Research, 2012, 200 : 171 - 182
  • [35] Convex approximations in stochastic programming by semidefinite programming
    Deak, Istvan
    Polik, Imre
    Prekopa, Andras
    Terlaky, Tamas
    [J]. ANNALS OF OPERATIONS RESEARCH, 2012, 200 (01) : 171 - 182
  • [36] Fastest rates for stochastic mirror descent methods
    Hanzely, Filip
    Richtarik, Peter
    [J]. COMPUTATIONAL OPTIMIZATION AND APPLICATIONS, 2021, 79 (03) : 717 - 766
  • [37] Stochastic Mirror Descent on Overparameterized Nonlinear Models
    Azizan, Navid
    Lale, Sahin
    Hassibi, Babak
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (12) : 7717 - 7727
  • [38] Fastest rates for stochastic mirror descent methods
    Filip Hanzely
    Peter Richtárik
    [J]. Computational Optimization and Applications, 2021, 79 : 717 - 766
  • [39] STOCHASTIC BLOCK MIRROR DESCENT METHODS FOR NONSMOOTH AND STOCHASTIC OPTIMIZATION
    Dang, Cong D.
    Lan, Guanghui
    [J]. SIAM JOURNAL ON OPTIMIZATION, 2015, 25 (02) : 856 - 881
  • [40] Adaptive Stochastic Mirror Descent for Constrained Optimization
    Bayandina, Anastasia
    [J]. 2017 CONSTRUCTIVE NONSMOOTH ANALYSIS AND RELATED TOPICS (DEDICATED TO THE MEMORY OF V.F. DEMYANOV) (CNSA), 2017, : 40 - 43