Communication-Efficient Algorithms for Statistical Optimization

Cited by: 0
Authors
Zhang, Yuchen [1 ]
Duchi, John C. [1 ]
Wainwright, Martin J. [2 ]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Keywords
distributed learning; stochastic optimization; averaging; subsampling; stochastic approximation
DOI
not available
CLC number
TP [automation and computer technology]
Discipline code
0812
Abstract
We analyze two communication-efficient algorithms for distributed optimization in statistical settings involving large-scale data sets. The first algorithm is a standard averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error (MSE) that decays as O(N^{-1} + (N/m)^{-2}). Whenever m <= sqrt(N), this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of bootstrap subsampling. Requiring only a single round of communication, it has mean-squared error that decays as O(N^{-1} + (N/m)^{-3}), and so is more robust to the amount of parallelization. In addition, we show that a stochastic gradient-based method attains mean-squared error decaying as O(N^{-1} + (N/m)^{-3/2}), easing computation at the expense of a potentially slower MSE rate. We also provide an experimental evaluation of our methods, investigating their performance both on simulated data and on a large-scale regression problem from the internet search domain. In particular, we show that our methods can be used to efficiently solve an advertisement prediction problem from the Chinese SoSo Search Engine, which involves logistic regression with N ~ 2.4 x 10^8 samples and d ~ 740,000 covariates.
Pages: 3321-3363 (43 pages)
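The one-shot averaging scheme described in the abstract can be sketched in a few lines. This is an illustrative toy reconstruction on synthetic least-squares data, not the paper's implementation; all names (`theta_star`, `local_estimates`, the problem sizes) are assumptions chosen for the demo.

```python
import numpy as np

# Toy sketch of the average-mixture idea: split N samples across m
# "machines", solve each local least-squares problem, then average
# the m local estimates in a single round of communication.
rng = np.random.default_rng(0)
N, d, m = 10_000, 5, 10
theta_star = rng.normal(size=d)          # ground-truth parameter

X = rng.normal(size=(N, d))
y = X @ theta_star + 0.1 * rng.normal(size=N)

# Local step: ordinary least squares on each machine's shard.
local_estimates = []
for X_i, y_i in zip(np.array_split(X, m), np.array_split(y, m)):
    theta_i, *_ = np.linalg.lstsq(X_i, y_i, rcond=None)
    local_estimates.append(theta_i)

# Communication step: average the m local parameter vectors.
theta_avg = np.mean(local_estimates, axis=0)

# Centralized baseline with access to all N samples.
theta_central, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.linalg.norm(theta_avg - theta_star))
print(np.linalg.norm(theta_central - theta_star))
```

For least squares both errors come out small and comparable, consistent with the abstract's claim that averaging matches the centralized rate when m is not too large relative to sqrt(N); for general (non-quadratic) losses the averaged estimate carries the extra O((N/m)^{-2}) bias term that the paper's subsampled variant reduces.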