MindTheStep-AsyncPSGD: Adaptive Asynchronous Parallel Stochastic Gradient Descent

Cited by: 0
Authors
Backstrom, Karl [1 ]
Papatriantafilou, Marina [1 ]
Tsigas, Philippas [1 ]
Affiliations
[1] Chalmers Univ Technol, Dept Comp Sci & Engn, Gothenburg, Sweden
DOI
10.1109/bigdata47090.2019.9006054
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Stochastic Gradient Descent (SGD) is very useful in optimization problems with high-dimensional non-convex target functions, and hence constitutes an important component of several Machine Learning and Data Analytics methods. Recently there has been significant work on understanding the parallelism inherent in SGD and its convergence properties. Asynchronous parallel SGD (AsyncPSGD) has received particular attention due to observed performance benefits. On the other hand, asynchrony implies inherent challenges in understanding the execution of the algorithm and its convergence, stemming from the fact that the contribution of a thread might be based on an old (stale) view of the state. In this work we aim to deepen the understanding of AsyncPSGD in order to increase statistical efficiency in the presence of stale gradients. We propose new models for capturing the nature of the staleness distribution in a practical setting. Using the proposed models, we derive a staleness-adaptive SGD framework, MindTheStep-AsyncPSGD, which adapts the step size in an online fashion and provably reduces the negative impact of asynchrony. Moreover, we provide general convergence-time bounds for a wide class of staleness-adaptive step-size strategies for convex target functions. We also provide a detailed empirical study showing how our approach yields faster convergence for deep learning applications.
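The staleness-adaptive idea can be illustrated with a minimal sketch (not the paper's actual algorithm; the toy objective, the geometric decay rule, and all constants here are illustrative assumptions): simulate AsyncPSGD on a single parameter, where each update applies a gradient computed from a randomly stale snapshot of that parameter, and shrink the step size as a function of the observed delay.

```python
import random

def grad(w):
    # Gradient of the toy convex objective f(w) = (w - 3)^2 / 2, minimized at w = 3.
    return w - 3.0

def async_sgd(adaptive, steps=2000, max_delay=8, eta0=0.1, seed=0):
    """Simulate AsyncPSGD on one parameter: each update applies a gradient
    computed from a randomly stale snapshot of the parameter vector."""
    rng = random.Random(seed)
    w = 0.0
    history = [w]  # past snapshots, so updates can read stale views
    for _ in range(steps):
        tau = rng.randint(0, max_delay)  # staleness (delay) of this update
        stale_w = history[max(0, len(history) - 1 - tau)]
        # Staleness-adaptive step size: decay geometrically in the delay.
        # This is just one member of the general class of staleness-adaptive
        # strategies; the base 0.7 is an arbitrary illustrative constant.
        eta = eta0 * (0.7 ** tau) if adaptive else eta0
        w -= eta * grad(stale_w)
        history.append(w)
    return w
```

With these mild settings both variants converge on this toy problem; the point of the adaptive rule is that updates based on very old views take smaller steps, which in the paper's analysis provably dampens the destabilizing effect of asynchrony as delays grow.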
Pages: 16-25 (10 pages)
Related Papers (50 in total)
  • [1] Adaptive wavefront control with asynchronous stochastic parallel gradient descent clusters
    Vorontsov, Mikhail A.; Carhart, Gary W.
    JOURNAL OF THE OPTICAL SOCIETY OF AMERICA A-OPTICS IMAGE SCIENCE AND VISION, 2006, 23 (10): 2613-2622
  • [2] Asynchronous Decentralized Parallel Stochastic Gradient Descent
    Lian, Xiangru; Zhang, Wei; Zhang, Ce; Liu, Ji
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018
  • [3] Parallel and distributed asynchronous adaptive stochastic gradient methods
    Xu, Yangyang; Xu, Yibo; Yan, Yonggui; Sutcher-Shepard, Colin; Grinberg, Leopold; Chen, Jie
    MATHEMATICAL PROGRAMMING COMPUTATION, 2023, 15 (03): 471-508
  • [4] Stochastic parallel gradient descent algorithm for adaptive optics system
    Ma, H.; Zhang, P.; Zhang, J.; Fan, C.; Wang, Y.
    QIANGJIGUANG YU LIZISHU/HIGH POWER LASER AND PARTICLE BEAMS, 2010, 22 (06): 1206-1210
  • [5] Stochastic modified equations for the asynchronous stochastic gradient descent
    An, Jing; Lu, Jianfeng; Ying, Lexing
    INFORMATION AND INFERENCE: A JOURNAL OF THE IMA, 2020, 9 (04): 851-873
  • [6] Asynchronous Stochastic Gradient Descent with Delay Compensation
    Zheng, Shuxin; Meng, Qi; Wang, Taifeng; Chen, Wei; Yu, Nenghai; Ma, Zhi-Ming; Liu, Tie-Yan
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 70, 2017
  • [7] Asynchronous Stochastic Gradient Descent for DNN Training
    Zhang, Shanshan; Zhang, Ce; You, Zhao; Zheng, Rong; Xu, Bo
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013: 6660-6663
  • [8] Practical Efficiency of Asynchronous Stochastic Gradient Descent
    Bhardwaj, Onkar; Cong, Guojing
    PROCEEDINGS OF 2016 2ND WORKSHOP ON MACHINE LEARNING IN HPC ENVIRONMENTS (MLHPC), 2016: 56-62
  • [9] Asynchronous Decentralized Accelerated Stochastic Gradient Descent
    Lan, G.; Zhou, Y.
    IEEE, (02): 802-811