MAC Aware Quantization for Distributed Gradient Descent

Cited by: 6
Authors
Chang, Wei-Ting [1 ]
Tandon, Ravi [1 ]
Affiliations
[1] Univ Arizona, Dept Elect & Comp Engn, Tucson, AZ 85721 USA
Keywords
DOI
10.1109/GLOBECOM42002.2020.9322254
CLC Classification
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we study the problem of federated learning (FL), where distributed users aim to jointly train a machine learning model with the help of a parameter server (PS). In each iteration of FL, users compute local gradients and then transmit quantized gradients for aggregation and model updates at the PS. One of the main challenges of FL is the communication overhead arising from its iterative nature and large model sizes. A recent direction for alleviating the communication bottleneck in FL is to let users communicate simultaneously over a multiple access channel (MAC), possibly making better use of the communication resources. In this paper, we consider the problem of FL over a MAC. We focus on the design of digital gradient transmission schemes over a MAC, where gradients at each user are first quantized and then transmitted over a MAC to be decoded individually at the PS. When designing digital FL schemes over MACs, there are new opportunities to assign different amounts of resources (e.g., rate or bandwidth) to different users based on a) the informativeness of the gradients at the users, and b) the underlying channel conditions. We propose a stochastic gradient quantization scheme in which the quantization parameters are optimized based on the capacity region of the MAC. We show that such channel-aware quantization for FL outperforms uniform quantization, particularly when users experience different channel conditions and when their gradients have varying levels of informativeness.
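To make the two ingredients of the abstract concrete, below is a minimal Python sketch (not the authors' implementation) of an unbiased stochastic gradient quantizer together with a toy per-user bit allocation driven by gradient informativeness and channel quality. The function names (`stochastic_quantize`, `allocate_bits`), the proportional allocation rule, and the fixed total bit budget are illustrative assumptions standing in for the paper's optimization over the MAC capacity region.

```python
# Minimal sketch (assumptions, not the paper's scheme): an unbiased stochastic
# quantizer plus a toy bit allocation across users, where a total bit budget
# stands in for the sum-rate constraint of the MAC capacity region.
import numpy as np

def stochastic_quantize(grad, num_levels):
    """Map each coordinate to one of `num_levels` uniformly spaced levels in
    [-r, r], rounding up or down at random so that E[output] = grad."""
    r = np.max(np.abs(grad)) + 1e-12        # dynamic range of this gradient
    step = 2.0 * r / (num_levels - 1)       # spacing between adjacent levels
    scaled = (grad + r) / step              # position in [0, num_levels - 1]
    lower = np.floor(scaled)
    prob_up = scaled - lower                # rounding probability keeps it unbiased
    q_index = lower + (np.random.rand(*grad.shape) < prob_up)
    return q_index * step - r               # map back to the original range

def allocate_bits(grad_norms, channel_rates, total_bits):
    """Toy rule: split `total_bits` across users in proportion to gradient norm
    (informativeness) times achievable rate (channel quality)."""
    weights = np.asarray(grad_norms, dtype=float) * np.asarray(channel_rates, dtype=float)
    shares = weights / weights.sum()
    return np.maximum(1, np.round(shares * total_bits)).astype(int)

if __name__ == "__main__":
    np.random.seed(0)
    # Three users with gradients of different magnitudes and different rates.
    grads = [np.random.standard_normal(1000) * s for s in (0.2, 1.0, 3.0)]
    rates = [1.0, 2.0, 4.0]                 # illustrative per-user achievable rates
    bits = allocate_bits([np.linalg.norm(g) for g in grads], rates, total_bits=24)
    quantized = [stochastic_quantize(g, 2 ** b) for g, b in zip(grads, bits)]
    aggregate = np.mean(quantized, axis=0)  # PS decodes each user, then averages
    print("bits per user:", bits)
    print("aggregation error:",
          np.linalg.norm(aggregate - np.mean(grads, axis=0)))
```

The proportional split above only illustrates why heterogeneous channel conditions and gradient magnitudes motivate unequal bit budgets; in the paper the per-user quantization parameters are chosen by optimizing over the MAC capacity region rather than by this heuristic.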
Pages: 6
Related Papers (50 items total)
  • [1] Bayesian Distributed Stochastic Gradient Descent
    Teng, Michael
    Wood, Frank
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 31 (NIPS 2018), 2018, 31
  • [2] Accelerated Distributed Nesterov Gradient Descent
    Qu, Guannan
    Li, Na
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2020, 65 (06) : 2566 - 2581
  • [3] Distributed Gradient Descent for Functional Learning
    Yu, Zhan
    Fan, Jun
    Shi, Zhongjie
    Zhou, Ding-Xuan
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (09) : 6547 - 6571
  • [4] Directed-Distributed Gradient Descent
    Xi, Chenguang
    Khan, Usman A.
    2015 53RD ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2015, : 1022 - 1026
  • [5] DISTRIBUTED GRADIENT DESCENT WITH CODED PARTIAL GRADIENT COMPUTATIONS
    Ozfatura, E.
    Ulukus, S.
    Gunduz, D.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3492 - 3496
  • [6] Kernelized vector quantization in gradient-descent learning
    Villmann, Thomas
    Haase, Sven
    Kaden, Marika
    NEUROCOMPUTING, 2015, 147 : 83 - 95
  • [7] Channel Pruning in Quantization-aware Training: an Adaptive Projection-gradient Descent-shrinkage-splitting Method
    Li, Zhijian
    Xin, Jack
    2022 5TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES, AI4I, 2022, : 31 - 34
  • [8] Predicting Throughput of Distributed Stochastic Gradient Descent
    Li, Zhuojin
    Paolieri, Marco
    Golubchik, Leana
    Lin, Sung-Han
    Yan, Wumo
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (11) : 2900 - 2912
  • [9] Distributed pairwise algorithms with gradient descent methods
    Wang, Baobin
    Hu, Ting
    NEUROCOMPUTING, 2019, 333 : 364 - 373
  • [10] Distributed stochastic gradient descent with discriminative aggregating
    Chen, Zhen-Hong
    Lan, Yan-Yan
    Guo, Jia-Feng
    Cheng, Xue-Qi
Jisuanji Xuebao/Chinese Journal of Computers, 2015, 38 (10) : 2054 - 2063