Communication-Efficient Algorithms for Statistical Optimization

Cited by: 0
Authors
Zhang, Yuchen [1 ]
Duchi, John C. [1 ]
Wainwright, Martin J. [2 ]
Affiliations
[1] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
Keywords
distributed learning; stochastic optimization; averaging; subsampling; stochastic approximation
DOI
not available
CLC number
TP [automation and computer technology]
Discipline code
0812
Abstract
We analyze two communication-efficient algorithms for distributed optimization in statistical settings involving large-scale data sets. The first algorithm is a standard averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error (MSE) that decays as O(N^{-1} + (N/m)^{-2}). Whenever m <= sqrt(N), this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of bootstrap subsampling. Requiring only a single round of communication, it has mean-squared error that decays as O(N^{-1} + (N/m)^{-3}), and so is more robust to the amount of parallelization. In addition, we show that a stochastic gradient-based method attains mean-squared error decaying as O(N^{-1} + (N/m)^{-3/2}), easing computation at the expense of a potentially slower MSE rate. We also provide an experimental evaluation of our methods, investigating their performance both on simulated data and on a large-scale regression problem from the internet search domain. In particular, we show that our methods can be used to efficiently solve an advertisement prediction problem from the Chinese SoSo Search Engine, which involves logistic regression with N ~ 2.4 x 10^8 samples and d ~ 740,000 covariates.
Pages: 3321-3363 (43 pages)
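The one-shot averaging scheme described in the abstract can be sketched in a few lines. This is an illustrative toy reconstruction on synthetic least-squares data, not the paper's implementation; all names (`theta_star`, `local_estimates`, the problem sizes) are assumptions chosen for the demo.

```python
import numpy as np

# Toy sketch of the average-mixture idea: split N samples across m
# "machines", solve each local least-squares problem, then average
# the m local estimates in a single round of communication.
rng = np.random.default_rng(0)
N, d, m = 10_000, 5, 10
theta_star = rng.normal(size=d)          # ground-truth parameter

X = rng.normal(size=(N, d))
y = X @ theta_star + 0.1 * rng.normal(size=N)

# Local step: ordinary least squares on each machine's shard.
local_estimates = []
for X_i, y_i in zip(np.array_split(X, m), np.array_split(y, m)):
    theta_i, *_ = np.linalg.lstsq(X_i, y_i, rcond=None)
    local_estimates.append(theta_i)

# Communication step: average the m local parameter vectors.
theta_avg = np.mean(local_estimates, axis=0)

# Centralized baseline with access to all N samples.
theta_central, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.linalg.norm(theta_avg - theta_star))
print(np.linalg.norm(theta_central - theta_star))
```

For least squares both errors come out small and comparable, consistent with the abstract's claim that averaging matches the centralized rate when m is not too large relative to sqrt(N); for general (non-quadratic) losses the averaged estimate carries the extra O((N/m)^{-2}) bias term that the paper's subsampled variant reduces.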