Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization

Cited by: 0
Authors
Zhang, Guodong [1]
Wang, Yuanhao [2]
Lessard, Laurent [3]
Grosse, Roger [1]
Affiliations
[1] Univ Toronto, Toronto, ON, Canada
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Northeastern Univ, Boston, MA 02115 USA
Keywords
VARIATIONAL INEQUALITY; ALGORITHMS
DOI
Not available
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Smooth minimax games often proceed by simultaneous or alternating gradient updates. Although algorithms with alternating updates are commonly used in practice, the majority of existing theoretical analyses focus on simultaneous algorithms for convenience of analysis. In this paper, we study alternating gradient descent-ascent (Alt-GDA) in minimax games and show that Alt-GDA is superior to its simultaneous counterpart (Sim-GDA) in many settings. We prove that Alt-GDA achieves a near-optimal local convergence rate for strongly convex-strongly concave (SCSC) problems, while Sim-GDA converges at a much slower rate. To our knowledge, this is the first result in any setting showing that Alt-GDA converges faster than Sim-GDA by more than a constant. We further adapt the theory of integral quadratic constraints (IQC) and show that Alt-GDA attains the same rate globally for a subclass of SCSC minimax problems. Empirically, we demonstrate that alternating updates speed up GAN training significantly and that the use of optimism only helps for simultaneous algorithms.
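The difference between the two update schemes can be illustrated with a minimal sketch. This is not the authors' code or experimental setup: the scalar quadratic game f(x, y) = (mu/2) x^2 + b x y - (mu/2) y^2, the constants `mu` and `b`, the step sizes, and the function names are all assumptions chosen for illustration. Sim-GDA evaluates both gradients at the same point, whereas Alt-GDA lets the ascent player use the descent player's freshly updated iterate.

```python
import math

MU, B = 0.1, 1.0  # hypothetical strong-convexity and coupling constants


def grads(x, y):
    """Gradients of the toy game f(x, y) = MU/2 x^2 + B x y - MU/2 y^2."""
    grad_x = MU * x + B * y
    grad_y = B * x - MU * y
    return grad_x, grad_y


def sim_gda(x, y, eta, steps):
    """Simultaneous GDA: both players step from the same (x, y)."""
    for _step in range(steps):
        gx, gy = grads(x, y)
        x, y = x - eta * gx, y + eta * gy
    return math.hypot(x, y)  # distance to the saddle point (0, 0)


def alt_gda(x, y, eta, steps):
    """Alternating GDA: y is updated using the already-updated x."""
    for _step in range(steps):
        gx, _unused = grads(x, y)
        x = x - eta * gx
        _unused, gy = grads(x, y)  # gradient evaluated at the new x
        y = y + eta * gy
    return math.hypot(x, y)


if __name__ == "__main__":
    for eta in (0.1, 1.0):
        print(f"eta={eta}: sim={sim_gda(1.0, 1.0, eta, 200):.3e}, "
              f"alt={alt_gda(1.0, 1.0, eta, 200):.3e}")
```

On this toy problem, Alt-GDA ends closer to the saddle point than Sim-GDA at the small step size, and it still converges at the larger step size where Sim-GDA diverges. This matches the qualitative claim of the abstract, but a single quadratic example says nothing about the general rates proved in the paper.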
Pages: 21