Distributed Mirror Descent for Stochastic Learning over Rate-limited Networks

Cited by: 0
Authors
Nokleby, Matthew [1 ]
Bajwa, Waheed U. [2 ]
Affiliations
[1] Wayne State Univ, Detroit, MI 48202 USA
[2] Rutgers State Univ, Piscataway, NJ USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
We present and analyze two algorithms, termed distributed stochastic approximation mirror descent (D-SAMD) and accelerated distributed stochastic approximation mirror descent (AD-SAMD), for distributed, stochastic optimization from high-rate data streams over rate-limited networks. Devices contend with fast streaming rates by mini-batching samples in the data stream, and they collaborate via distributed consensus to compute variance-reduced averages of distributed subgradients. This induces a trade-off: mini-batching slows down the effective streaming rate, but may also slow down convergence. We present two theoretical contributions that characterize this trade-off: (i) bounds on the convergence rates of D-SAMD and AD-SAMD, and (ii) sufficient conditions for order-optimum convergence of D-SAMD and AD-SAMD in terms of the network size/topology and the ratio of the data streaming and communication rates. We find that AD-SAMD achieves order-optimum convergence in a larger regime than D-SAMD. We demonstrate the effectiveness of the proposed algorithms via numerical experiments.
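The abstract describes the mechanism only at a high level; below is a minimal numerical sketch of the mini-batch-plus-consensus pattern it refers to, not the paper's D-SAMD/AD-SAMD algorithms themselves. It assumes a Euclidean mirror map (so the mirror step reduces to a plain gradient step), a ring-topology mixing matrix W, a synthetic least-squares data stream, and illustrative choices of batch size, step sizes, and number of consensus rounds.

```python
# Sketch only: n nodes each mini-batch b stream samples per round, gossip their
# subgradients over the network to reduce variance, then take a (Euclidean)
# mirror-descent step. All problem data and parameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, b, rounds, consensus_steps = 4, 10, 8, 200, 2

# Ring-topology, doubly stochastic mixing matrix (assumed for illustration).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

x_true = rng.normal(size=dim)  # target of a noisy least-squares data stream

def stream_subgradient(x, batch):
    """Mini-batched stochastic gradient of a least-squares streaming loss."""
    A = rng.normal(size=(batch, dim))
    y = A @ x_true + 0.1 * rng.normal(size=batch)
    return A.T @ (A @ x - y) / batch

x = np.zeros((n_nodes, dim))  # one local iterate per node
for t in range(1, rounds + 1):
    g = np.stack([stream_subgradient(x[i], b) for i in range(n_nodes)])
    for _ in range(consensus_steps):   # gossip averaging of subgradients
        g = W @ g
    step = 1.0 / np.sqrt(t)            # O(1/sqrt(t)) step size, typical for SA
    x = x - step * g                   # Euclidean mirror-descent (gradient) step
    x = W @ x                          # one mixing step keeps iterates close

print("avg distance to x_true:", np.linalg.norm(x.mean(axis=0) - x_true))
```

Gossiping the subgradients before the update is what produces the variance-reduced averages mentioned in the abstract; the accelerated variant and the rate-limited communication analysis are not reflected in this sketch.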
Pages: 5
Related Papers
50 records in total
  • [31] Liu, Kangqiao; Liu Ziyin; Ueda, Masahito. Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent. International Conference on Machine Learning, Vol. 139, 2021.
  • [32] Chee, Jerry; Toulis, Panos. Convergence diagnostics for stochastic gradient descent with constant learning rate. International Conference on Artificial Intelligence and Statistics, Vol. 84, 2018.
  • [33] Vasudevan, Shrihari. Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks. Entropy, 2020, 22 (5).
  • [34] Pu, Shi; Olshevsky, Alex; Paschalidis, Ioannis Ch. Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning: Examining Distributed and Centralized Stochastic Gradient Descent. IEEE Signal Processing Magazine, 2020, 37 (3): 114-122.
  • [35] Jiang, Wenbin; Ye, Geyan; Yang, Laurence T.; Zhu, Jian; Ma, Yang; Xie, Xia; Jin, Hai. A Novel Stochastic Gradient Descent Algorithm Based on Grouping over Heterogeneous Cluster Systems for Distributed Deep Learning. 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019: 391-398.
  • [36] Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen; Li, Bo. A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning. 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS 2018), 2018: 425-432.
  • [37] Werfel, J.; Xie, X. H.; Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Advances in Neural Information Processing Systems 16, 2004, 16: 1197-1204.
  • [38] Werfel, J.; Xie, X. H.; Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Neural Computation, 2005, 17 (12): 2699-2718.
  • [39] Teng, Yunfei; Gao, Wenbo; Chalus, Francois; Choromanska, Anna; Goldfarb, Donald; Weller, Adrian. Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019.
  • [40] He, Wenwu. Limited Stochastic Meta-Descent for Kernel-Based Online Learning. Neural Computation, 2009, 21 (9): 2667-2686.