Local Stochastic Gradient Descent Ascent: Convergence Analysis and Communication Efficiency

Cited by: 0
Authors
Deng, Yuyang [1]
Mahdavi, Mehrdad [1]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
Funding
U.S. National Science Foundation
Keywords
VARIATIONAL-INEQUALITIES
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Local SGD is a promising approach to overcoming the communication overhead of distributed learning by reducing the synchronization frequency among worker nodes. Despite recent theoretical advances for local SGD in empirical risk minimization, the efficiency of its counterpart in minimax optimization remains unexplored. Motivated by large-scale minimax learning problems, such as adversarially robust learning and training generative adversarial networks (GANs), we propose local Stochastic Gradient Descent Ascent (local SGDA), in which the primal and dual variables are trained locally and averaged periodically, significantly reducing the number of communication rounds. We show that local SGDA provably optimizes distributed minimax problems on both homogeneous and heterogeneous data with a reduced number of communications, and we establish convergence rates under strongly-convex-strongly-concave and nonconvex-strongly-concave settings. In addition, we propose a novel variant, local SGDA+, to solve nonconvex-nonconcave problems. We provide corroborating empirical evidence on several distributed minimax problems.
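
To make the periodic-averaging scheme described in the abstract concrete, below is a minimal illustrative sketch in Python/NumPy. It is not the authors' implementation: the gradient oracles grad_x and grad_y, the simultaneous update rule, and all parameter names are assumptions made for illustration.

    import numpy as np

    def local_sgda(grad_x, grad_y, x0, y0, num_workers, rounds,
                   local_steps, eta_x, eta_y):
        """Illustrative sketch of local SGDA (not the paper's reference code).

        Each worker runs local_steps simultaneous stochastic gradient
        descent (primal x) / ascent (dual y) updates between communication
        rounds; the server then averages all primal and dual iterates.
        """
        # All workers start from the same initial point.
        xs = [x0.copy() for _ in range(num_workers)]
        ys = [y0.copy() for _ in range(num_workers)]
        for _ in range(rounds):
            for i in range(num_workers):
                for _ in range(local_steps):
                    # grad_x / grad_y are assumed stochastic gradient
                    # oracles for worker i's local objective f_i(x, y).
                    gx = grad_x(i, xs[i], ys[i])
                    gy = grad_y(i, xs[i], ys[i])
                    xs[i] = xs[i] - eta_x * gx  # descent on the primal variable
                    ys[i] = ys[i] + eta_y * gy  # ascent on the dual variable
            # Periodic synchronization: a single communication round
            # averages the primal and dual iterates across workers.
            x_avg = sum(xs) / num_workers
            y_avg = sum(ys) / num_workers
            xs = [x_avg.copy() for _ in range(num_workers)]
            ys = [y_avg.copy() for _ in range(num_workers)]
        return xs[0], ys[0]

    # Toy usage on the saddle problem f(x, y) = 0.5*x**2 - 0.5*y**2 + x*y,
    # whose unique saddle point is (0, 0); every worker sees the same
    # objective plus Gaussian gradient noise (a homogeneous-data setting).
    rng = np.random.default_rng(0)
    gx = lambda i, x, y: x + y + 0.01 * rng.standard_normal()
    gy = lambda i, x, y: x - y + 0.01 * rng.standard_normal()
    x_star, y_star = local_sgda(gx, gy, np.array(5.0), np.array(-3.0),
                                num_workers=4, rounds=200, local_steps=10,
                                eta_x=0.05, eta_y=0.05)

Since workers communicate only once every local_steps local updates, the number of synchronization rounds drops by a factor of local_steps relative to fully synchronous SGDA, which is the communication saving the abstract refers to.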
Pages: 12
Related Papers
50 entries
  • [31] The Convergence of Stochastic Gradient Descent in Asynchronous Shared Memory
    Alistarh, Dan
    De Sa, Christopher
    Konstantinov, Nikola
    PODC'18: PROCEEDINGS OF THE 2018 ACM SYMPOSIUM ON PRINCIPLES OF DISTRIBUTED COMPUTING, 2018: 169-177
  • [32] On the Convergence of Decentralized Stochastic Gradient Descent With Biased Gradients
    Jiang, Yiming
    Kang, Helei
    Liu, Jinlan
    Xu, Dongpo
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2025, 73: 549-558
  • [33] CONVERGENCE OF RIEMANNIAN STOCHASTIC GRADIENT DESCENT ON HADAMARD MANIFOLD
    Sakai, Hiroyuki
    Iiduka, Hideaki
    PACIFIC JOURNAL OF OPTIMIZATION, 2024, 20 (04): 743-767
  • [34] Convergence of Stochastic Gradient Descent in Deep Neural Network
    Zhou, Bai-cun
    Han, Cong-ying
    Guo, Tian-de
    ACTA MATHEMATICAE APPLICATAE SINICA-ENGLISH SERIES, 2021, 37 (01): 126-136
  • [35] Convergence behavior of diffusion stochastic gradient descent algorithm
    Barani, Fatemeh
    Savadi, Abdorreza
    Yazdi, Hadi Sadoghi
    SIGNAL PROCESSING, 2021, 183
  • [36] Convergence of Momentum-Based Stochastic Gradient Descent
    Jin, Ruinan
    He, Xingkang
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA), 2020: 779-784
  • [37] Understanding and Detecting Convergence for Stochastic Gradient Descent with Momentum
    Chee, Jerry
    Li, Ping
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020: 133-140
  • [38] Practical Efficiency of Asynchronous Stochastic Gradient Descent
    Bhardwaj, Onkar
    Cong, Guojing
    PROCEEDINGS OF 2016 2ND WORKSHOP ON MACHINE LEARNING IN HPC ENVIRONMENTS (MLHPC), 2016: 56-62
  • [40] Local gain adaptation in stochastic gradient descent
    Schraudolph, N. N.
    NINTH INTERNATIONAL CONFERENCE ON ARTIFICIAL NEURAL NETWORKS (ICANN99), VOLS 1 AND 2, 1999, (470): 569-574