Local Stochastic Gradient Descent Ascent: Convergence Analysis and Communication Efficiency

Cited by: 0
Authors
Deng, Yuyang [1]
Mahdavi, Mehrdad [1]
Affiliations
[1] Pennsylvania State University, University Park, PA 16802, USA
Funding
U.S. National Science Foundation
Keywords
Variational inequalities
DOI
Not available
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Local SGD is a promising approach to overcoming the communication overhead in distributed learning by reducing the synchronization frequency among worker nodes. Despite recent theoretical advances in local SGD for empirical risk minimization, the efficiency of its counterpart in minimax optimization remains unexplored. Motivated by large-scale minimax learning problems, such as adversarially robust learning and training generative adversarial networks (GANs), we propose local Stochastic Gradient Descent Ascent (local SGDA), in which the primal and dual variables are trained locally and averaged periodically to significantly reduce the number of communication rounds. We show that local SGDA can provably optimize distributed minimax problems on both homogeneous and heterogeneous data with a reduced number of communications, and we establish convergence rates under strongly-convex-strongly-concave and nonconvex-strongly-concave settings. In addition, we propose a novel variant, local SGDA+, to solve nonconvex-nonconcave problems. We give corroborating empirical evidence on different distributed minimax problems.
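To make the "train locally, average periodically" scheme concrete, below is a minimal sketch of the local SGDA idea on a toy strongly-convex-strongly-concave quadratic saddle-point problem. The number of workers, the objective, the step size, the noise model, and the helper name stochastic_grads are illustrative assumptions for this sketch, not the paper's algorithm parameters or experimental setup.

```python
# Minimal sketch of local SGDA (assumed toy setup, not the paper's configuration).
# Each of M workers holds f_i(x, y) = 0.5*||x||^2 + x^T A_i y - 0.5*||y||^2,
# which is strongly convex in x and strongly concave in y. Workers run K local
# stochastic descent/ascent steps, then the primal and dual iterates are averaged.
import numpy as np

rng = np.random.default_rng(0)
M, d, K, rounds, eta = 4, 5, 10, 50, 0.05
A = [rng.normal(scale=0.5, size=(d, d)) for _ in range(M)]  # heterogeneous local data

def stochastic_grads(A_i, x, y, noise=0.01):
    """Noisy gradients of f_i with respect to x (descent) and y (ascent)."""
    gx = x + A_i @ y + noise * rng.normal(size=d)
    gy = A_i.T @ x - y + noise * rng.normal(size=d)
    return gx, gy

x_global = rng.normal(size=d)
y_global = rng.normal(size=d)

for r in range(rounds):                      # communication rounds
    xs, ys = [], []
    for i in range(M):                       # each worker starts from the averaged model
        x, y = x_global.copy(), y_global.copy()
        for _ in range(K):                   # K local SGDA steps without communication
            gx, gy = stochastic_grads(A[i], x, y)
            x -= eta * gx                    # descent step on the primal variable
            y += eta * gy                    # ascent step on the dual variable
        xs.append(x); ys.append(y)
    x_global = np.mean(xs, axis=0)           # periodic averaging of primal iterates
    y_global = np.mean(ys, axis=0)           # periodic averaging of dual iterates

print("||x|| =", np.linalg.norm(x_global), "||y|| =", np.linalg.norm(y_global))
# The averaged objective has its unique saddle point at (x, y) = (0, 0), so both
# norms should shrink toward a noise-dominated neighborhood as rounds increase.
```

Communication happens only once per outer round (every K local steps), which is the source of the communication savings the abstract describes; setting K = 1 recovers fully synchronized distributed SGDA.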
Pages: 12
Related Papers
(50 in total; items [41]-[50] shown below)
  • [41] Beznosikov, Aleksandr; Gorbunov, Eduard; Berard, Hugo; Loizou, Nicolas. Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient Methods. International Conference on Artificial Intelligence and Statistics, Vol. 206, 2023, pp. 172-235.
  • [42] Luo, Luo; Ye, Haishan; Huang, Zhichao; Zhang, Tong. Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.
  • [43] Phuong, Tran Thi; Phong, Le Trieu; Fukushima, Kazuhide. Distributed Stochastic Gradient Descent With Compressed and Skipped Communication. IEEE Access, 2023, 11: 99836-99846.
  • [44] Li, Weiyu; Wu, Zhaoxian; Chen, Tianyi; Li, Liping; Ling, Qing. Communication-Censored Distributed Stochastic Gradient Descent. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(11): 6831-6843.
  • [45] Chee, Jerry; Toulis, Panos. Convergence Diagnostics for Stochastic Gradient Descent with Constant Learning Rate. International Conference on Artificial Intelligence and Statistics, Vol. 84, 2018.
  • [46] Roberts, Lindon; Smyth, Edward. A Simplified Convergence Theory for Byzantine Resilient Stochastic Gradient Descent. EURO Journal on Computational Optimization, 2022, 10.
  • [47] Lu, Kaihong; Wang, Hongxia; Zhang, Huanshui; Wang, Long. Convergence in High Probability of Distributed Stochastic Gradient Descent Algorithms. IEEE Transactions on Automatic Control, 2024, 69(4): 2189-2204.
  • [48] Mitra, Partha P. Fast Convergence for Stochastic and Distributed Gradient Descent in the Interpolation Limit. 2018 26th European Signal Processing Conference (EUSIPCO), 2018, pp. 1890-1894.
  • [49] Xu, Zhiqiang; Cao, Xin; Gao, Xin. Convergence Analysis of Gradient Descent for Eigenvector Computation. Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018, pp. 2933-2939.
  • [50] Yan, Yan; Xu, Yi; Lin, Qihang; Liu, Wei; Yang, Tianbao. Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization. Advances in Neural Information Processing Systems 33 (NeurIPS 2020), 2020.