An Optimal Multistage Stochastic Gradient Method for Minimax Problems

Cited: 0
Authors
Fallah, Alireza [1 ]
Ozdaglar, Asuman [1 ]
Pattathil, Sarath [1 ]
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
APPROXIMATION ALGORITHMS; COMPOSITE OPTIMIZATION; EXTRAGRADIENT METHOD;
DOI
10.1109/cdc42340.2020.9304033
CLC number
TP [Automation technology; computer technology]
Discipline code
0812
Abstract
In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the convergence rate and the size of this neighborhood. Next, we propose a multistage variant of stochastic GDA (M-GDA) that runs in multiple stages with a particular learning rate decay schedule and converges to the exact solution of the minimax problem. We show M-GDA achieves the lower bounds in terms of noise dependence without any assumptions on the knowledge of noise characteristics. We also show that M-GDA obtains a linear decay rate with respect to the error's dependence on the initial error, although the dependence on condition number is suboptimal. In order to improve this dependence, we apply the multistage machinery to the stochastic Optimistic Gradient Descent Ascent (OGDA) algorithm and propose the M-OGDA algorithm, which also achieves the optimal linear decay rate with respect to the initial error. To the best of our knowledge, this method is the first to simultaneously achieve the best dependence on the noise characteristics as well as the initial error and condition number.
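The stochastic GDA update and the stage-wise stepsize decay described in the abstract can be sketched on a toy strongly convex-strongly concave quadratic. Note this is only an illustrative sketch: the quadratic objective, noise level, stage lengths, and halving schedule below are assumptions for demonstration, not the paper's exact M-GDA parameters or analysis.

```python
import numpy as np

def noisy_grad(x, y, mu=1.0, b=0.5, sigma=0.1, rng=None):
    # Noisy gradient oracle for f(x, y) = (mu/2) x^2 + b*x*y - (mu/2) y^2,
    # which is strongly convex in x and strongly concave in y (saddle at 0).
    gx = mu * x + b * y          # df/dx
    gy = b * x - mu * y          # df/dy
    if rng is not None:
        gx += sigma * rng.standard_normal()
        gy += sigma * rng.standard_normal()
    return gx, gy

def multistage_gda(x, y, eta0=0.2, stages=5, iters_per_stage=400, seed=0):
    # Stochastic GDA run in stages; the stepsize is cut between stages so the
    # iterate's noise neighborhood shrinks toward the saddle point.
    rng = np.random.default_rng(seed)
    eta = eta0
    for _ in range(stages):
        for _ in range(iters_per_stage):
            gx, gy = noisy_grad(x, y, rng=rng)
            # Simultaneous update: descent in x, ascent in y.
            x, y = x - eta * gx, y + eta * gy
        eta /= 2.0  # illustrative decay schedule (halving per stage)
    return x, y

x, y = multistage_gda(3.0, -2.0)
```

With a constant stepsize the iterates would only reach a noise-dominated neighborhood of the saddle; shrinking the stepsize between stages is what lets the final iterate approach the exact solution, mirroring the multistage idea in the paper.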
Pages
3573-3579
Page count
7