Distributed Stochastic Gradient Descent With Compressed and Skipped Communication

Cited by: 0
Authors
Phuong, Tran Thi [1 ]
Phong, Le Trieu [2 ]
Fukushima, Kazuhide [1 ]
Affiliations
[1] KDDI Res Inc, Saitama 3568502, Japan
[2] Natl Inst Informat & Commun Technol NICT, Tokyo 1848795, Japan
Keywords
Compressed sensing; Gradient methods; Compressed and skipped communication; Distributed stochastic gradient descent; Deep learning; Optimization
DOI
10.1109/ACCESS.2023.3315331
CLC Number
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
This paper introduces CompSkipDSGD, a new algorithm for distributed stochastic gradient descent that aims to improve communication efficiency by compressing and selectively skipping communication. In addition to compression, CompSkipDSGD allows both workers and the server to skip communication in any iteration of the training process and reserve it for future iterations without significantly decreasing testing accuracy. Our experimental results on the large-scale ImageNet dataset demonstrate that CompSkipDSGD can save hundreds of gigabytes of communication while maintaining similar levels of accuracy compared to state-of-the-art algorithms. The experimental results are supported by a theoretical analysis that demonstrates the convergence of CompSkipDSGD under established assumptions. Overall, CompSkipDSGD could be useful for reducing communication costs in distributed deep learning and enabling the use of large-scale datasets and models in complex environments.
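The abstract describes the mechanism only at a high level. The sketch below is not the authors' algorithm or code; it is a minimal Python/NumPy illustration of the general idea under assumed choices: top-k sparsification as the compression operator, a norm-threshold rule for deciding when a worker skips communication, and an error-feedback residual that carries unsent gradient mass into later iterations. All names (top_k, simulate, skip_threshold) and the per-worker quadratic objective are hypothetical.

```python
# Illustrative sketch only: CompSkipDSGD's exact compression and skipping rules
# are defined in the paper. Here we ASSUME top-k sparsification, a norm-based
# skip rule, and error feedback (unsent gradient mass is stored locally and
# reserved for future iterations).
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def simulate(num_workers=4, dim=100, steps=50, k=10, skip_threshold=1e-3, lr=0.1):
    rng = np.random.default_rng(0)
    x = rng.normal(size=dim)                                   # shared model parameters
    residuals = [np.zeros(dim) for _ in range(num_workers)]    # gradient mass not yet sent
    targets = [rng.normal(size=dim) for _ in range(num_workers)]  # per-worker quadratic objectives

    for _ in range(steps):
        updates = []
        for w in range(num_workers):
            grad = x - targets[w]              # gradient of 0.5 * ||x - target||^2
            pending = residuals[w] + grad      # new gradient plus previously unsent mass
            if np.linalg.norm(pending) < skip_threshold:
                residuals[w] = pending         # skip communication, keep everything for later
                continue
            msg = top_k(pending, k)            # compress before communicating
            residuals[w] = pending - msg       # remember what compression dropped
            updates.append(msg)
        if updates:                            # server updates only when messages arrive
            x -= lr * np.mean(updates, axis=0)
    return x

if __name__ == "__main__":
    x_final = simulate()
    print("final parameter norm:", np.linalg.norm(x_final))
```

Under these assumptions, communication per iteration shrinks from dim to k floats per participating worker, and a worker that skips an iteration loses nothing permanently because its residual is folded into a later message.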
Pages: 99836 - 99846
Number of pages: 11
Related Papers
50 records in total
  • [1] Communication-Censored Distributed Stochastic Gradient Descent
    Li, Weiyu
    Wu, Zhaoxian
    Chen, Tianyi
    Li, Liping
    Ling, Qing
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (11) : 6831 - 6843
  • [2] Distributed Stochastic Gradient Descent with Event-Triggered Communication
    George, Jemin
    Gurram, Prudhvi
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7169 - 7178
  • [3] Compressed Distributed Gradient Descent: Communication-Efficient Consensus over Networks
    Zhang, Xin
    Liu, Jia
    Zhu, Zhengyuan
    Bentley, Elizabeth S.
    [J]. IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2019), 2019, : 2431 - 2439
  • [4] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [5] Online Distributed Stochastic Gradient Algorithm for Nonconvex Optimization With Compressed Communication
    Li, Jueyou
    Li, Chaojie
    Fan, Jing
    Huang, Tingwen
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (02) : 936 - 951
  • [6] An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent
    Tu, Jun
    Zhou, Jia
    Ren, Donglin
    [J]. COMPUTER COMMUNICATIONS, 2022, 195 : 416 - 423
  • [7] Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
    Li, Zhize
    Kovalev, Dmitry
    Qian, Xun
    Richtarik, Peter
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [8] Predicting Throughput of Distributed Stochastic Gradient Descent
    Li, Zhuojin
    Paolieri, Marco
    Golubchik, Leana
    Lin, Sung-Han
    Yan, Wumo
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2900 - 2912
  • [9] Distributed stochastic gradient descent with discriminative aggregating
    Chen, Zhen-Hong
    Lan, Yan-Yan
    Guo, Jia-Feng
    Cheng, Xue-Qi
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38 (10): 2054 - 2063
  • [10] Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
    Shen, Shuheng
    Xu, Linli
    Liu, Jingchang
    Liang, Xianfeng
    Cheng, Yifei
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4582 - 4589