Distributed Mirror Descent for Stochastic Learning over Rate-limited Networks

Cited by: 0
Authors
Nokleby, Matthew [1 ]
Bajwa, Waheed U. [2 ]
Affiliations
[1] Wayne State Univ, Detroit, MI 48202 USA
[2] Rutgers State Univ, Piscataway, NJ USA
Keywords
DOI
Not available
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809;
Abstract
We present and analyze two algorithms, termed distributed stochastic approximation mirror descent (D-SAMD) and accelerated distributed stochastic approximation mirror descent (AD-SAMD), for distributed, stochastic optimization from high-rate data streams over rate-limited networks. Devices contend with fast streaming rates by mini-batching samples in the data stream, and they collaborate via distributed consensus to compute variance-reduced averages of distributed subgradients. This induces a trade-off: mini-batching slows down the effective streaming rate, but may also slow down convergence. We present two theoretical contributions that characterize this trade-off: (i) bounds on the convergence rates of D-SAMD and AD-SAMD, and (ii) sufficient conditions for order-optimum convergence of D-SAMD and AD-SAMD in terms of the network size/topology and the ratio of the data streaming and communication rates. We find that AD-SAMD achieves order-optimum convergence in a larger regime than D-SAMD. We demonstrate the effectiveness of the proposed algorithms via numerical experiments.
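The abstract describes the mechanism only at a high level; below is a minimal numerical sketch of the mini-batch-plus-consensus pattern it refers to, not the paper's D-SAMD/AD-SAMD algorithms themselves. It assumes a Euclidean mirror map (so the mirror step reduces to a plain gradient step), a ring-topology mixing matrix W, a synthetic least-squares data stream, and illustrative choices of batch size, step sizes, and number of consensus rounds.

```python
# Sketch only: n nodes each mini-batch b stream samples per round, gossip their
# subgradients over the network to reduce variance, then take a (Euclidean)
# mirror-descent step. All problem data and parameters here are illustrative.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim, b, rounds, consensus_steps = 4, 10, 8, 200, 2

# Ring-topology, doubly stochastic mixing matrix (assumed for illustration).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

x_true = rng.normal(size=dim)  # target of a noisy least-squares data stream

def stream_subgradient(x, batch):
    """Mini-batched stochastic gradient of a least-squares streaming loss."""
    A = rng.normal(size=(batch, dim))
    y = A @ x_true + 0.1 * rng.normal(size=batch)
    return A.T @ (A @ x - y) / batch

x = np.zeros((n_nodes, dim))  # one local iterate per node
for t in range(1, rounds + 1):
    g = np.stack([stream_subgradient(x[i], b) for i in range(n_nodes)])
    for _ in range(consensus_steps):   # gossip averaging of subgradients
        g = W @ g
    step = 1.0 / np.sqrt(t)            # O(1/sqrt(t)) step size, typical for SA
    x = x - step * g                   # Euclidean mirror-descent (gradient) step
    x = W @ x                          # one mixing step keeps iterates close

print("avg distance to x_true:", np.linalg.norm(x.mean(axis=0) - x_true))
```

Gossiping the subgradients before the update is what produces the variance-reduced averages mentioned in the abstract; the accelerated variant and the rate-limited communication analysis are not reflected in this sketch.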
Pages: 5
Related Papers
50 records in total
  • [31] Liu, Kangqiao; Liu Ziyin; Ueda, Masahito. Noise and Fluctuation of Finite Learning Rate Stochastic Gradient Descent. International Conference on Machine Learning, Vol. 139, 2021.
  • [32] Chee, Jerry; Toulis, Panos. Convergence diagnostics for stochastic gradient descent with constant learning rate. International Conference on Artificial Intelligence and Statistics, Vol. 84, 2018.
  • [33] Vasudevan, Shrihari. Mutual Information Based Learning Rate Decay for Stochastic Gradient Descent Training of Deep Neural Networks. Entropy, 2020, 22 (5).
  • [34] Pu, Shi; Olshevsky, Alex; Paschalidis, Ioannis Ch. Asymptotic Network Independence in Distributed Stochastic Optimization for Machine Learning: Examining Distributed and Centralized Stochastic Gradient Descent. IEEE Signal Processing Magazine, 2020, 37 (3): 114-122.
  • [35] Jiang, Wenbin; Ye, Geyan; Yang, Laurence T.; Zhu, Jian; Ma, Yang; Xie, Xia; Jin, Hai. A Novel Stochastic Gradient Descent Algorithm Based on Grouping over Heterogeneous Cluster Systems for Distributed Deep Learning. 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), 2019: 391-398.
  • [36] Shi, Shaohuai; Wang, Qiang; Chu, Xiaowen; Li, Bo. A DAG Model of Synchronous Stochastic Gradient Descent in Distributed Deep Learning. 2018 IEEE 24th International Conference on Parallel and Distributed Systems (ICPADS 2018), 2018: 425-432.
  • [37] Werfel, J.; Xie, X. H.; Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Advances in Neural Information Processing Systems 16, 2004, 16: 1197-1204.
  • [38] Werfel, J.; Xie, X. H.; Seung, H. S. Learning curves for stochastic gradient descent in linear feedforward networks. Neural Computation, 2005, 17 (12): 2699-2718.
  • [39] Teng, Yunfei; Gao, Wenbo; Chalus, Francois; Choromanska, Anna; Goldfarb, Donald; Weller, Adrian. Leader Stochastic Gradient Descent for Distributed Training of Deep Learning Models. Advances in Neural Information Processing Systems 32 (NIPS 2019), 2019.
  • [40] He, Wenwu. Limited Stochastic Meta-Descent for Kernel-Based Online Learning. Neural Computation, 2009, 21 (9): 2667-2686.