Distributed Stochastic Gradient Descent With Compressed and Skipped Communication

Cited by: 0
Authors
Phuong, Tran Thi [1 ]
Phong, Le Trieu [2 ]
Fukushima, Kazuhide [1 ]
Affiliations
[1] KDDI Res Inc, Saitama 3568502, Japan
[2] Natl Inst Informat & Commun Technol NICT, Tokyo 1848795, Japan
Keywords
Compressed sensing; Gradient methods; Compressed and skipped communication; Distributed stochastic gradient descent; Deep learning; Optimization
DOI
10.1109/ACCESS.2023.3315331
CLC Number
TP [Automation technology, computer technology]
Discipline Classification Code
0812
Abstract
This paper introduces CompSkipDSGD, a new algorithm for distributed stochastic gradient descent that aims to improve communication efficiency by compressing and selectively skipping communication. In addition to compression, CompSkipDSGD allows both workers and the server to skip communication in any iteration of the training process and reserve it for future iterations without significantly decreasing testing accuracy. Our experimental results on the large-scale ImageNet dataset demonstrate that CompSkipDSGD can save hundreds of gigabytes of communication while maintaining similar levels of accuracy compared to state-of-the-art algorithms. The experimental results are supported by a theoretical analysis that demonstrates the convergence of CompSkipDSGD under established assumptions. Overall, CompSkipDSGD could be useful for reducing communication costs in distributed deep learning and enabling the use of large-scale datasets and models in complex environments.
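The abstract describes the mechanism only at a high level. The sketch below is not the authors' algorithm or code; it is a minimal Python/NumPy illustration of the general idea under assumed choices: top-k sparsification as the compression operator, a norm-threshold rule for deciding when a worker skips communication, and an error-feedback residual that carries unsent gradient mass into later iterations. All names (top_k, simulate, skip_threshold) and the per-worker quadratic objective are hypothetical.

```python
# Illustrative sketch only: CompSkipDSGD's exact compression and skipping rules
# are defined in the paper. Here we ASSUME top-k sparsification, a norm-based
# skip rule, and error feedback (unsent gradient mass is stored locally and
# reserved for future iterations).
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def simulate(num_workers=4, dim=100, steps=50, k=10, skip_threshold=1e-3, lr=0.1):
    rng = np.random.default_rng(0)
    x = rng.normal(size=dim)                                   # shared model parameters
    residuals = [np.zeros(dim) for _ in range(num_workers)]    # gradient mass not yet sent
    targets = [rng.normal(size=dim) for _ in range(num_workers)]  # per-worker quadratic objectives

    for _ in range(steps):
        updates = []
        for w in range(num_workers):
            grad = x - targets[w]              # gradient of 0.5 * ||x - target||^2
            pending = residuals[w] + grad      # new gradient plus previously unsent mass
            if np.linalg.norm(pending) < skip_threshold:
                residuals[w] = pending         # skip communication, keep everything for later
                continue
            msg = top_k(pending, k)            # compress before communicating
            residuals[w] = pending - msg       # remember what compression dropped
            updates.append(msg)
        if updates:                            # server updates only when messages arrive
            x -= lr * np.mean(updates, axis=0)
    return x

if __name__ == "__main__":
    x_final = simulate()
    print("final parameter norm:", np.linalg.norm(x_final))
```

Under these assumptions, communication per iteration shrinks from dim to k floats per participating worker, and a worker that skips an iteration loses nothing permanently because its residual is folded into a later message.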
Pages: 99836 - 99846
Number of pages: 11
Related Papers
50 records in total
  • [1] Communication-Censored Distributed Stochastic Gradient Descent
    Li, Weiyu
    Wu, Zhaoxian
    Chen, Tianyi
    Li, Liping
    Ling, Qing
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (11) : 6831 - 6843
  • [2] Distributed Stochastic Gradient Descent with Event-Triggered Communication
    George, Jemin
    Gurram, Prudhvi
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7169 - 7178
  • [3] Compressed Distributed Gradient Descent: Communication-Efficient Consensus over Networks
    Zhang, Xin
    Liu, Jia
    Zhu, Zhengyuan
    Bentley, Elizabeth S.
    [J]. IEEE CONFERENCE ON COMPUTER COMMUNICATIONS (IEEE INFOCOM 2019), 2019, : 2431 - 2439
  • [4] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [5] Online Distributed Stochastic Gradient Algorithm for Nonconvex Optimization With Compressed Communication
    Li, Jueyou
    Li, Chaojie
    Fan, Jing
    Huang, Tingwen
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (02) : 936 - 951
  • [6] An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent
    Tu, Jun
    Zhou, Jia
    Ren, Donglin
    [J]. COMPUTER COMMUNICATIONS, 2022, 195 : 416 - 423
  • [7] Acceleration for Compressed Gradient Descent in Distributed and Federated Optimization
    Li, Zhize
    Kovalev, Dmitry
    Qian, Xun
    Richtarik, Peter
    [J]. INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 119, 2020, 119
  • [8] Predicting Throughput of Distributed Stochastic Gradient Descent
    Li, Zhuojin
    Paolieri, Marco
    Golubchik, Leana
    Lin, Sung-Han
    Yan, Wumo
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2900 - 2912
  • [9] Distributed stochastic gradient descent with discriminative aggregating
    Chen, Zhen-Hong
    Lan, Yan-Yan
    Guo, Jia-Feng
    Cheng, Xue-Qi
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38 (10): 2054 - 2063
  • [10] Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent
    Shen, Shuheng
    Xu, Linli
    Liu, Jingchang
    Liang, Xianfeng
    Cheng, Yifei
    [J]. PROCEEDINGS OF THE TWENTY-EIGHTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2019, : 4582 - 4589