Local Stochastic Gradient Descent Ascent: Convergence Analysis and Communication Efficiency

Cited by: 0
Authors
Deng, Yuyang [1 ]
Mahdavi, Mehrdad [1]
Affiliation
[1] Penn State Univ, University Pk, PA 16802 USA
Funding
US National Science Foundation
Keywords
VARIATIONAL-INEQUALITIES
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Local SGD is a promising approach to overcome the communication overhead in distributed learning by reducing the synchronization frequency among worker nodes. Despite recent theoretical advances of local SGD in empirical risk minimization, the efficiency of its counterpart in minimax optimization remains unexplored. Motivated by large-scale minimax learning problems, such as adversarially robust learning and training generative adversarial networks (GANs), we propose local Stochastic Gradient Descent Ascent (local SGDA), where the primal and dual variables are trained locally and averaged periodically to significantly reduce the number of communications. We show that local SGDA can provably optimize distributed minimax problems on both homogeneous and heterogeneous data with a reduced number of communications, and we establish convergence rates under strongly-convex-strongly-concave and nonconvex-strongly-concave settings. In addition, we propose a novel variant, local SGDA+, to solve nonconvex-nonconcave problems. We give corroborating empirical evidence on different distributed minimax problems.
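To make the periodic-averaging mechanism described in the abstract concrete, below is a minimal NumPy sketch of local SGDA, not the authors' reference implementation. It assumes hypothetical per-worker stochastic gradient oracles grad_x(i, x, y, rng) and grad_y(i, x, y, rng) for each local objective f_i(x, y); each worker takes tau local descent/ascent steps, and a communication round averages the primal and dual iterates across workers.

```python
import numpy as np

def local_sgda(grad_x, grad_y, x0, y0, num_workers, rounds, tau,
               eta_x, eta_y, rng=None):
    """Sketch of local SGDA: tau local descent/ascent steps per worker,
    then periodic averaging of the primal (x) and dual (y) iterates.

    grad_x / grad_y are hypothetical stochastic gradient oracles for
    worker i's local objective f_i at (x, y); this interface is assumed
    for illustration, not taken from the paper.
    """
    rng = rng or np.random.default_rng(0)
    # Each worker keeps its own copy of the primal and dual variables.
    xs = [x0.copy() for _ in range(num_workers)]
    ys = [y0.copy() for _ in range(num_workers)]
    for _ in range(rounds):                  # communication rounds
        for _ in range(tau):                 # local steps between communications
            for i in range(num_workers):
                gx = grad_x(i, xs[i], ys[i], rng)
                gy = grad_y(i, xs[i], ys[i], rng)
                xs[i] = xs[i] - eta_x * gx   # descent on the primal variable
                ys[i] = ys[i] + eta_y * gy   # ascent on the dual variable
        # Communication: average primal and dual iterates across workers.
        x_avg = sum(xs) / num_workers
        y_avg = sum(ys) / num_workers
        xs = [x_avg.copy() for _ in range(num_workers)]
        ys = [y_avg.copy() for _ in range(num_workers)]
    return sum(xs) / num_workers, sum(ys) / num_workers

# Toy usage on a strongly-convex-strongly-concave quadratic (hypothetical data):
# f_i(x, y) = 0.5||x||^2 + x^T A_i y - 0.5||y||^2, whose saddle point is (0, 0);
# gradients are perturbed with Gaussian noise to mimic stochastic oracles.
d, workers = 5, 4
A = [np.eye(d) * (0.1 * (i + 1)) for i in range(workers)]
gx = lambda i, x, y, rng: x + A[i] @ y + 0.01 * rng.standard_normal(d)
gy = lambda i, x, y, rng: A[i].T @ x - y + 0.01 * rng.standard_normal(d)
x_star, y_star = local_sgda(gx, gy, np.ones(d), np.ones(d), workers,
                            rounds=200, tau=10, eta_x=0.05, eta_y=0.05)
print(np.linalg.norm(x_star), np.linalg.norm(y_star))  # both should be near 0
```

With tau local steps per round, the number of communications is reduced by roughly a factor of tau compared with synchronizing after every stochastic gradient step, which is the communication saving the abstract refers to.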
Pages: 12
Related Papers
50 records in total
  • [21] Linear Convergence of Adaptive Stochastic Gradient Descent
    Xie, Yuege
    Wu, Xiaoxia
    Ward, Rachel
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108
  • [22] On the convergence and improvement of stochastic normalized gradient descent
    Shen-Yi Zhao
    Yin-Peng Xie
    Wu-Jun Li
    Science China Information Sciences, 2021, 64
  • [23] On the convergence and improvement of stochastic normalized gradient descent
    Zhao, Shen-Yi
    Xie, Yin-Peng
    Li, Wu-Jun
    SCIENCE CHINA-INFORMATION SCIENCES, 2021, 64 (03)
  • [24] Efficiency Ordering of Stochastic Gradient Descent
    Hu, Jie
    Doshi, Vishwaraj
    Eun, Do Young
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022
  • [25] Convergence Analysis of Accelerated Stochastic Gradient Descent Under the Growth Condition
    Chen, You-Lin
    Na, Sen
    Kolar, Mladen
    MATHEMATICS OF OPERATIONS RESEARCH, 2024, 49 (04) : 2492 - 2526
  • [26] Communication-Efficient Local Stochastic Gradient Descent for Scalable Deep Learning
    Lee, Sunwoo
    Kang, Qiao
    Agrawal, Ankit
    Choudhary, Alok
    Liao, Wei-keng
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 718 - 727
  • [27] Convergence of stochastic gradient descent under a local Lojasiewicz condition for deep neural networks
    An, Jing
    Lu, Jianfeng
    arXiv, 2023
  • [28] Convergence of Stochastic Gradient Descent in Deep Neural Network
    Bai-cun Zhou
    Cong-ying Han
    Tian-de Guo
    Acta Mathematicae Applicatae Sinica, English Series, 2021, 37 : 126 - 136
  • [29] Optimized convergence of stochastic gradient descent by weighted averaging
    Hagedorn, Melinda
    Jarre, Florian
    OPTIMIZATION METHODS & SOFTWARE, 2024, 39 (04): : 699 - 724
  • [30] Fast Convergence Stochastic Parallel Gradient Descent Algorithm
    Hu Dongting
    Shen Wen
    Ma Wenchao
    Liu Xinyu
    Su Zhouping
    Zhu Huaxin
    Zhang Xiumei
    Que Lizhi
    Zhu Zhuowei
    Zhang Yixin
    Chen Guoqing
    Hu Lifa
    LASER & OPTOELECTRONICS PROGRESS, 2019, 56 (12)