Asynchronous Distributed Optimization with Stochastic Delays

Cited: 0
Authors
Glasgow, Margalit [1]
Wootters, Mary [1]
Affiliations
[1] Stanford Univ, Stanford, CA 94305 USA
Keywords
CONVERGENCE RATE
DOI
None available
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We study asynchronous finite-sum minimization in a distributed-data setting with a central parameter server. While asynchrony is well understood in parallel settings where the data is accessible by all machines (e.g., modifications of variance-reduced gradient algorithms like SAGA work well), little is known for the distributed-data setting. We develop ADSAGA, an algorithm based on SAGA for the distributed-data setting, in which the data is partitioned among many machines. We show that with m machines, under a natural stochastic delay model with a mean delay of m, ADSAGA converges in Õ((n + √(mκ)) log(1/ε)) iterations, where n is the number of component functions and κ is a condition number. This complexity sits squarely between the complexity Õ((n + κ) log(1/ε)) of SAGA without delays and the complexity Õ((n + mκ) log(1/ε)) of parallel asynchronous algorithms where the delays are arbitrary (but bounded by O(m)) and the data is accessible by all. Existing asynchronous algorithms in the distributed-data setting with arbitrary delays have only been shown to converge in Õ(n²κ log(1/ε)) iterations. We empirically compare the iteration complexity and wallclock performance of ADSAGA to existing parallel and distributed algorithms, including synchronous minibatch algorithms. Our results demonstrate the wallclock advantage of variance-reduced asynchronous approaches over SGD or synchronous approaches.
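The paper's ADSAGA algorithm itself is not reproduced in this record; for context, the following is a minimal serial sketch of the variance-reduced SAGA update that ADSAGA builds on, applied to an illustrative least-squares objective. The function name, the objective, and all parameters are choices made here for illustration, and the asynchronous, distributed-data mechanics (partitioned data, delayed reads/writes at a parameter server) are omitted.

```python
import numpy as np

def saga(A, b, steps=10000, lr=0.01, seed=0):
    """Minimize (1/n) * sum_i 0.5*(a_i . x - b_i)^2 with the SAGA update.

    Serial sketch only: the asynchronous, distributed-data aspects of
    ADSAGA (delayed gradients, data partitioned across machines) are
    not modeled here.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    grads = np.zeros((n, d))       # table of last-seen component gradients
    g_avg = grads.mean(axis=0)     # running average of the table
    for _ in range(steps):
        i = rng.integers(n)
        g_new = (A[i] @ x - b[i]) * A[i]        # gradient of f_i at x
        x -= lr * (g_new - grads[i] + g_avg)    # variance-reduced step
        g_avg += (g_new - grads[i]) / n         # update the average incrementally
        grads[i] = g_new
    return x
```

The step direction g_new - grads[i] + g_avg is an unbiased estimate of the full gradient whose variance vanishes as the gradient table converges, which is what yields the linear (log(1/ε)) iteration complexity the abstract cites.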
Pages: 33
Related Papers
50 records total
  • [1] Asynchronous Stochastic Optimization Robust to Arbitrary Delays
    Cohen, Alon
    Daniely, Amit
    Drori, Yoel
    Koren, Tomer
    Schain, Mariano
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [2] Distributed Asynchronous Algorithms with Stochastic Delays for Constrained Optimization Problems with Conditions of Time Drift
    Beidas, B. F.
    Papavassilopoulos, G. P.
    PARALLEL COMPUTING, 1995, 21 (09): 1431-1450
  • [3] Distributed Asynchronous Constrained Stochastic Optimization
    Srivastava, Kunal
    Nedic, Angelia
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2011, 5 (04): 772-790
  • [4] Distributed Stochastic Optimization with Large Delays
    Zhou, Zhengyuan
    Mertikopoulos, Panayotis
    Bambos, Nicholas
    Glynn, Peter
    Ye, Yinyu
    MATHEMATICS OF OPERATIONS RESEARCH, 2021, 47 (03): 2082-2111
  • [5] Asynchronous Distributed Semi-Stochastic Gradient Optimization
    Zhang, Ruiliang
    Zheng, Shuai
    Kwok, James T.
    THIRTIETH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2016: 2323-2329
  • [6] Distributed Asynchronous Deterministic and Stochastic Gradient Optimization Algorithms
    Tsitsiklis, J. N.
    Bertsekas, D. P.
    Athans, M.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1986, 31 (09): 803-812
  • [7] Distributed Asynchronous Optimization with Unbounded Delays: How Slow Can You Go?
    Zhou, Zhengyuan
    Mertikopoulos, Panayotis
    Bambos, Nicholas
    Glynn, Peter
    Ye, Yinyu
    Li, Li-Jia
    Li, Fei-Fei
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 80, 2018, 80
  • [8] A Stochastic Primal-Dual Algorithm for Distributed Asynchronous Composite Optimization
    Bianchi, Pascal
    Hachem, Walid
    Iutzeler, Franck
    2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014: 732-736
  • [9] Convergence Analysis of Asynchronous Linear Iterations with Stochastic Delays
    Beidas, B. F.
    Papavassilopoulos, G. P.
    PARALLEL COMPUTING, 1993, 19 (03): 281-302
  • [10] Clustering in Stochastic Asynchronous Algorithms for Distributed Simulations
    Manita, A.
    Simonot, F.
    STOCHASTIC ALGORITHMS: FOUNDATIONS AND APPLICATIONS, PROCEEDINGS, 2005, 3777: 26-37