Local Stochastic Gradient Descent Ascent: Convergence Analysis and Communication Efficiency

Cited by: 0
Authors
Deng, Yuyang [1 ]
Mahdavi, Mehrdad [1]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
Funding
U.S. National Science Foundation
Keywords
Variational inequalities
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Local SGD is a promising approach to overcoming the communication overhead in distributed learning by reducing the synchronization frequency among worker nodes. Despite recent theoretical advances for local SGD in empirical risk minimization, the efficiency of its counterpart in minimax optimization remains unexplored. Motivated by large-scale minimax learning problems, such as adversarially robust learning and training generative adversarial networks (GANs), we propose local Stochastic Gradient Descent Ascent (local SGDA), in which the primal and dual variables are trained locally and averaged periodically to significantly reduce the number of communication rounds. We show that local SGDA provably optimizes distributed minimax problems on both homogeneous and heterogeneous data with a reduced number of communication rounds, and we establish convergence rates in the strongly-convex-strongly-concave and nonconvex-strongly-concave settings. In addition, we propose a novel variant, local SGDA+, to solve nonconvex-nonconcave problems. We provide corroborating empirical evidence on different distributed minimax problems.
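To make the update pattern described in the abstract concrete, below is a minimal sketch of local SGDA on a toy strongly-convex-strongly-concave quadratic minimax problem: each worker runs several local stochastic descent steps on the primal variable and ascent steps on the dual variable, and both variables are periodically averaged across workers. The objective, noise model, step sizes, and synchronization period are illustrative assumptions, not the paper's algorithmic constants or experimental setup.

# Minimal sketch of local SGDA (assumed toy setup, not the paper's experiments).
# Global objective: (1/n) * sum_i [ 0.5*a_i*x^2 + b_i*x*y - 0.5*c_i*y^2 ],
# which is strongly convex in x and strongly concave in y; its saddle point is the origin.
import numpy as np

rng = np.random.default_rng(0)
num_workers, local_steps, rounds = 4, 10, 50
eta_x, eta_y = 0.05, 0.05          # primal / dual step sizes (assumed values)

# Heterogeneous local objectives: worker i holds coefficients (a_i, b_i, c_i).
a = rng.uniform(1.0, 2.0, num_workers)
b = rng.uniform(-0.5, 0.5, num_workers)
c = rng.uniform(1.0, 2.0, num_workers)

x = np.ones(num_workers)           # local primal iterates
y = np.ones(num_workers)           # local dual iterates

for r in range(rounds):
    for _ in range(local_steps):
        # Stochastic gradients: exact local gradients plus Gaussian noise.
        gx = a * x + b * y + 0.1 * rng.standard_normal(num_workers)
        gy = b * x - c * y + 0.1 * rng.standard_normal(num_workers)
        x -= eta_x * gx            # descent step on the primal variable
        y += eta_y * gy            # ascent step on the dual variable
    # Communication round: average primal and dual iterates across workers.
    x[:] = x.mean()
    y[:] = y.mean()

print(f"final averaged iterate: x={x[0]:.4f}, y={y[0]:.4f} "
      "(the saddle point of the averaged objective is the origin)")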
Pages: 12