Communication-Censored Distributed Stochastic Gradient Descent

Times Cited: 11
Authors
Li, Weiyu [1 ,2 ]
Wu, Zhaoxian [1 ,3 ,4 ]
Chen, Tianyi [5 ]
Li, Liping [6 ]
Ling, Qing [1 ,3 ,4 ]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, Guangzhou 510006, Peoples R China
[2] Univ Sci & Technol China, Sch Gifted Young, Hefei 230026, Peoples R China
[3] Sun Yat Sen Univ, Guangdong Prov Key Lab Computat Sci, Guangzhou 510006, Peoples R China
[4] Pazhou Lab, Guangzhou 510300, Peoples R China
[5] Rensselaer Polytech Inst, Dept Elect Comp & Syst Engn, Troy, NY 12180 USA
[6] Univ Sci & Technol China, Dept Automat, Hefei 230027, Peoples R China
Keywords
Servers; Convergence; Optimization; Stochastic processes; Machine learning algorithms; Signal processing algorithms; Communication censoring; communication efficiency; distributed optimization; stochastic gradient descent (SGD); algorithms
DOI
10.1109/TNNLS.2021.3083655
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
This article develops a communication-efficient algorithm for solving stochastic optimization problems defined over a distributed network, aiming to reduce the burdensome communication in applications such as distributed machine learning. Different from existing works based on quantization and sparsification, we introduce a communication-censoring technique to reduce the transmission of variables, which leads to our communication-censored distributed stochastic gradient descent (CSGD) algorithm. Specifically, in CSGD, the latest minibatch stochastic gradient at a worker is transmitted to the server if and only if it is sufficiently informative; when the latest gradient is not transmitted, the server reuses the stale one. To make this communication-censoring strategy effective, the batch size increases over iterations so as to alleviate the effect of stochastic gradient noise. Theoretically, CSGD enjoys the same order of convergence rate as SGD while effectively reducing communication. Numerical experiments demonstrate the sizable communication savings of CSGD.
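The censoring rule described in the abstract lends itself to a short sketch. Below is a minimal, illustrative Python implementation assuming a synthetic least-squares objective; the threshold schedule tau, the learning rate, and the batch-size schedule are assumptions for illustration, not the paper's exact choices.

```python
# Minimal sketch of communication-censored SGD (CSGD) as described in the
# abstract. The objective, threshold schedule, and step size are illustrative
# assumptions, not the paper's exact parameters.
import numpy as np

rng = np.random.default_rng(0)
d, n_workers, n_iters, n_samples = 5, 4, 200, 100
x = np.zeros(d)        # model parameters kept at the server
lr = 0.1               # step size (assumption)

# Synthetic per-worker least-squares data (assumption).
A = [rng.normal(size=(n_samples, d)) for _ in range(n_workers)]
b = [Ai @ np.ones(d) + 0.1 * rng.normal(size=n_samples) for Ai in A]

# Stale gradients cached at the server, one per worker.
last_sent = [np.zeros(d) for _ in range(n_workers)]
transmissions = 0

for k in range(n_iters):
    batch_size = min(10 + k, n_samples)  # increasing batch size damps gradient noise
    tau = 1.0 / (k + 1)                  # censoring threshold schedule (assumption)
    grads = []
    for i in range(n_workers):
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        g = A[i][idx].T @ (A[i][idx] @ x - b[i][idx]) / batch_size
        # Censoring: transmit only if the new gradient differs enough from the
        # last transmitted one; otherwise the server reuses the stale copy.
        if np.linalg.norm(g - last_sent[i]) >= tau:
            last_sent[i] = g
            transmissions += 1
        grads.append(last_sent[i])
    x -= lr * np.mean(grads, axis=0)     # server aggregates and updates

print(f"transmissions: {transmissions} / {n_iters * n_workers} possible")
```

Running the sketch typically shows far fewer transmissions than the worker-iteration count while the iterate still converges, mirroring the communication savings the abstract reports.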
Pages: 6831 - 6843
Page count: 13
Related Papers
50 records in total
  • [1] Distributed Stochastic Gradient Descent With Compressed and Skipped Communication
    Phuong, Tran Thi
    Phong, Le Trieu
    Fukushima, Kazuhide
    [J]. IEEE ACCESS, 2023, 11 : 99836 - 99846
  • [2] Distributed Stochastic Gradient Descent with Event-Triggered Communication
    George, Jemin
    Gurram, Prudhvi
    [J]. THIRTY-FOURTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, THE THIRTY-SECOND INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE AND THE TENTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2020, 34 : 7169 - 7178
  • [3] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [4] COKE: Communication-Censored Decentralized Kernel Learning
    Xu, Ping
    Wang, Yue
    Chen, Xiang
    Tian, Zhi
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2021, 22: 1 - 35
  • [5] Communication-Censored ADMM for Decentralized Consensus Optimization
    Liu, Yaohua
    Xu, Wei
    Wu, Gang
    Tian, Zhi
    Ling, Qing
    [J]. IEEE TRANSACTIONS ON SIGNAL PROCESSING, 2019, 67 (10) : 2565 - 2579
  • [6] An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent
    Tu, Jun
    Zhou, Jia
    Ren, Donglin
    [J]. COMPUTER COMMUNICATIONS, 2022, 195 : 416 - 423
  • [7] Predicting Throughput of Distributed Stochastic Gradient Descent
    Li, Zhuojin
    Paolieri, Marco
    Golubchik, Leana
    Lin, Sung-Han
    Yan, Wumo
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2900 - 2912
  • [8] Distributed stochastic gradient descent with discriminative aggregating
    Chen, Zhen-Hong
    Lan, Yan-Yan
    Guo, Jia-Feng
    Cheng, Xue-Qi
    [J]. Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38(10): 2054 - 2063
  • [9] Communication-Censored Linearized ADMM for Decentralized Consensus Optimization
    Li, Weiyu
    Liu, Yaohua
    Tian, Zhi
    Ling, Qing
    [J]. IEEE TRANSACTIONS ON SIGNAL AND INFORMATION PROCESSING OVER NETWORKS, 2020, 6(1): 18 - 34