Faster Distributed Deep Net Training: Computation and Communication Decoupled Stochastic Gradient Descent

Cited by: 0
Authors:
Shen, Shuheng [1,2]
Xu, Linli [1,2]
Liu, Jingchang [1,2]
Liang, Xianfeng [1,2]
Cheng, Yifei [1,3]
Affiliations:
[1] Univ Sci & Technol China, Anhui Prov Key Lab Big Data Anal & Applicat, Hefei, Anhui, Peoples R China
[2] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei, Anhui, Peoples R China
[3] Univ Sci & Technol China, Sch Data Sci, Hefei, Anhui, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
DOI: not available
CLC number:
TP18 [Theory of Artificial Intelligence]
Discipline codes:
081104; 0812; 0835; 1405
Abstract
With the increase in the amount of data and the expansion of model scale, distributed parallel training has become an important and successful technique for addressing the resulting optimization challenges. Nevertheless, although distributed stochastic gradient descent (SGD) algorithms can achieve a linear iteration speedup, in practice they are significantly limited by the communication cost, making it difficult to achieve a linear time speedup. In this paper, we propose a computation and communication decoupled stochastic gradient descent (CoCoD-SGD) algorithm that runs computation and communication in parallel to reduce the communication cost. We prove that CoCoD-SGD achieves a linear iteration speedup with respect to the total computation capability of the hardware resources. In addition, it has lower communication complexity and a better time speedup compared with traditional distributed SGD algorithms. Experiments on deep neural network training demonstrate the significant improvements of CoCoD-SGD: when training ResNet18 and VGG16 with 16 GeForce GTX 1080Ti GPUs, CoCoD-SGD is up to 2-3× faster than traditional synchronous SGD.
Pages: 4582-4589
Number of pages: 8
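
As a rough illustration of the decoupling described in the abstract, the sketch below overlaps the model-averaging communication of one round with the local SGD computation of the same round, then applies the locally computed updates on top of the freshly averaged model. It is a minimal single-process simulation, not the authors' implementation: the all_reduce_average helper, the thread-based overlap, and the toy quadratic objective are assumptions standing in for a real collective-communication backend and real parallel workers.

# Minimal single-process sketch of computation/communication decoupling
# (CoCoD-SGD style). Assumptions: all_reduce_average stands in for a real
# all-reduce collective, the workers are simulated sequentially, and the
# objective is a toy noisy quadratic.
import threading
import numpy as np

def all_reduce_average(local_models):
    # Stand-in for a collective all-reduce: average the workers' parameters.
    return np.mean(local_models, axis=0)

def local_sgd_steps(x, grad_fn, lr, num_steps, rng):
    # Run a few local SGD steps and return the accumulated update (new - old).
    x_start = x.copy()
    for _ in range(num_steps):
        x = x - lr * grad_fn(x, rng)
    return x - x_start

def cocod_sgd(grad_fn, dim, num_workers=4, rounds=50, local_steps=8, lr=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Every worker starts from the same model.
    models = [np.zeros(dim) for _ in range(num_workers)]
    for _ in range(rounds):
        averaged = {}

        def communicate():
            # Communication "thread": average the workers' current models.
            averaged["model"] = all_reduce_average(models)

        comm_thread = threading.Thread(target=communicate)
        comm_thread.start()

        # Computation proceeds in parallel with the averaging above:
        # each worker computes a local update starting from its current model.
        updates = [local_sgd_steps(m, grad_fn, lr, local_steps, rng) for m in models]

        comm_thread.join()
        # Apply the locally computed updates on top of the freshly averaged model.
        models = [averaged["model"] + u for u in updates]
    return all_reduce_average(models)

if __name__ == "__main__":
    target = np.ones(10)
    # Toy stochastic gradient of 0.5 * ||x - target||^2 with Gaussian noise.
    grad = lambda x, rng: (x - target) + 0.01 * rng.standard_normal(x.shape)
    print(cocod_sgd(grad, dim=10))

Because the averaging step only needs the models from the start of the round, it can run concurrently with the local computation; the per-round wall-clock cost is roughly max(computation, communication) rather than their sum, which is the source of the speedup claimed in the abstract.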