Accelerate Distributed Stochastic Descent for Nonconvex Optimization with Momentum

Cited by: 1
Authors
Cong, Guojing [1 ]
Liu, Tianyi [2 ]
Affiliations
[1] IBM TJ Watson Res Ctr, Ossining, NY 10562 USA
[2] Georgia Inst Technol, Atlanta, GA 30332 USA
DOI
10.1109/MLHPCAI4S51975.2020.00011
Chinese Library Classification: TP18 [Artificial Intelligence Theory]
Discipline Classification Codes: 081104; 0812; 0835; 1405
Abstract
Momentum methods have been used extensively in optimizers for deep learning. Recent studies show that distributed training through K-step averaging has many desirable properties. We propose a momentum method for such model-averaging approaches. At the individual-learner level, traditional stochastic gradient descent is applied; at the meta level (the global learner), a single momentum term is applied to the averaged model update, which we call block momentum. We analyze the convergence and scaling properties of this momentum method. Our experimental results show that block momentum not only accelerates training but also achieves better final results.
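To make the meta-level update concrete, below is a minimal Python sketch of K-step averaging with a block-momentum term, assuming the meta update treats the averaged local progress as a pseudo-gradient and smooths it with one momentum buffer. The toy quadratic objective, the worker setup, and the hyperparameters (K, beta, lr) are illustrative assumptions, not the paper's exact algorithm or settings.

# Minimal sketch: K-step averaging with a meta-level ("block") momentum term.
# Hypothetical toy setup: each worker minimizes a quadratic 0.5*||x - c_i||^2
# with plain SGD; all names and hyperparameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
dim, n_workers, K, rounds = 10, 4, 8, 50
lr, beta = 0.1, 0.9                           # local step size, block-momentum coefficient

centers = rng.normal(size=(n_workers, dim))   # per-worker optima (data heterogeneity)
x_global = np.zeros(dim)                      # global (meta-level) model
momentum = np.zeros(dim)                      # block-momentum buffer

def local_grad(x, c):
    # Stochastic gradient of 0.5*||x - c||^2 with additive noise.
    return (x - c) + 0.01 * rng.normal(size=x.shape)

for t in range(rounds):
    # Each learner runs K plain-SGD steps starting from the current global model.
    local_models = []
    for i in range(n_workers):
        x = x_global.copy()
        for _ in range(K):
            x -= lr * local_grad(x, centers[i])
        local_models.append(x)

    # Meta-level update: average the local models, treat the block of progress
    # as a pseudo-gradient, and apply a single momentum term to it.
    block_step = np.mean(local_models, axis=0) - x_global
    momentum = beta * momentum + block_step
    x_global = x_global + momentum

print("distance to mean optimum:", np.linalg.norm(x_global - centers.mean(axis=0)))

Compared with plain K-step averaging (the beta = 0 case), the momentum buffer carries information about the direction of progress across communication rounds, which is the acceleration effect the abstract describes.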
Pages: 29-39
Number of pages: 11
Related Papers (50 total)
  • [1] Distributed Stochastic Consensus Optimization With Momentum for Nonconvex Nonsmooth Problems
    Wang, Zhiguo
    Zhang, Jiawei
    Chang, Tsung-Hui
    Li, Jian
    Luo, Zhi-Quan
    IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2021, 69: 4486-4501
  • [2] Distributed stochastic nonsmooth nonconvex optimization
    Kungurtsev, Vyacheslav
    OPERATIONS RESEARCH LETTERS, 2022, 50 (06): 627-631
  • [3] Quantized Gradient Descent Algorithm for Distributed Nonconvex Optimization
    Yoshida, Junya
    Hayashi, Naoki
    Takai, Shigemasa
    IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2023, E106A (10): 1297-1304
  • [4] ON DISTRIBUTED STOCHASTIC GRADIENT DESCENT FOR NONCONVEX FUNCTIONS IN THE PRESENCE OF BYZANTINES
    Bulusu, Saikiran
    Khanduri, Prashant
    Sharma, Pranay
    Varshney, Pramod K.
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020: 3137-3141
  • [5] Second-Order Guarantees of Stochastic Gradient Descent in Nonconvex Optimization
    Vlaski, Stefan
    Sayed, Ali H.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (12): 6489-6504
  • [6] pbSGD: Powered Stochastic Gradient Descent Methods for Accelerated Nonconvex Optimization
    Zhou, Beitong
    Liu, Jun
    Sun, Weigao
    Chen, Ruijuan
    Tomlin, Claire
    Yuan, Ye
    PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020: 3258-3266
  • [7] ZEROTH-ORDER STOCHASTIC PROJECTED GRADIENT DESCENT FOR NONCONVEX OPTIMIZATION
    Liu, Sijia
    Li, Xingguo
    Chen, Pin-Yu
    Haupt, Jarvis
    Amini, Lisa
    2018 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP 2018), 2018: 1179-1183
  • [8] The combination of particle swarm optimization and stochastic gradient descent with momentum
    Chen, Chi-Hua
    ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2022, 18: 132-132
  • [9] Proximal stochastic recursive momentum algorithm for nonsmooth nonconvex optimization problems
    Wang, Zhaoxin
    Wen, Bo
    OPTIMIZATION, 2024, 73 (02): 481-495
  • [10] Zeroth-order algorithms for stochastic distributed nonconvex optimization
    Yi, Xinlei
    Zhang, Shengjun
    Yang, Tao
    Johansson, Karl H.
    AUTOMATICA, 2022, 142