Compressed Coded Distributed Computing

Cited by: 0
Authors
Li, Songze [1]
Maddah-Ali, Mohammad Ali [2]
Avestimehr, A. Salman [1]
Affiliations
[1] Univ Southern Calif, Dept Elect Engn, Los Angeles, CA 90089 USA
[2] Nokia Bell Labs, Holmdel, NJ USA
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Communication overhead is one of the major performance bottlenecks in large-scale distributed computing systems, especially for machine learning applications. Conventionally, compression techniques reduce the communication load by combining intermediate results of the same computation task as much as possible. More recently, the development of coded distributed computing (CDC) has shown that it is possible to code across intermediate results of different tasks to further reduce communication. We propose a new scheme, named compressed coded distributed computing (compressed CDC for short), which jointly exploits these two techniques (i.e., combining intermediate results of the same computation and coding across intermediate results of different computations) to significantly reduce the communication load for computations whose final stage is a linear aggregation of intermediate results. Such computations are prevalent in machine learning (e.g., distributed training, where partial gradients are computed in a distributed manner and then averaged in the final stage). In particular, compressed CDC first compresses/combines several intermediate results for a single computation, and then uses multiple such combined packets to create a coded multicast packet that is simultaneously useful for multiple computations. We characterize the achievable communication load of compressed CDC and show that it substantially outperforms both combining methods and the CDC scheme.
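The two steps described in the abstract (combine within one computation, then code across computations) can be illustrated with a toy sketch. This is not the paper's scheme or its parameters: the 3-node, 6-file placement, the made-up `intermediate` values, the use of XOR as the coding operation, and all function names are assumptions chosen only for illustration.

```python
# Toy setup (hypothetical): 3 nodes, 6 files, every file stored on 2 nodes.
# Node k is responsible for task k, whose result is the sum over all files.
STORAGE = {1: {1, 2, 3, 4}, 2: {3, 4, 5, 6}, 3: {5, 6, 1, 2}}
ALL_FILES = set(range(1, 7))
# Files that node k does not store locally.
MISSING = {k: ALL_FILES - STORAGE[k] for k in STORAGE}


def intermediate(q, n):
    """Made-up intermediate value of task q on file n (two non-negative ints)."""
    return (10 * q + n, 100 * q + 7 * n)


def combine(q, files):
    """Compression step: sum the intermediate values of ONE task over `files`."""
    vals = [intermediate(q, n) for n in files]
    return tuple(sum(c) for c in zip(*vals))


def local_combine(node, q, files):
    """Same as combine, but checks that `node` really stores all `files`."""
    assert files <= STORAGE[node], "node lacks some of the required files"
    return combine(q, files)


def segment_index(q, sender):
    """Which of the two segments of task q's combined packet `sender` carries."""
    others = sorted(n for n in STORAGE if n != q)
    return others.index(sender)


def encode(sender):
    """Coding step: XOR one segment of each OTHER task's combined packet."""
    msg = 0
    for q in STORAGE:
        if q != sender:
            packet = local_combine(sender, q, MISSING[q])
            msg ^= packet[segment_index(q, sender)]
    return msg


def decode(receiver, sender, msg):
    """Cancel the segment the receiver can compute itself; keep its own segment."""
    (other,) = [n for n in STORAGE if n not in (receiver, sender)]
    side_info = local_combine(receiver, other, MISSING[other])
    wanted = msg ^ side_info[segment_index(other, sender)]
    return segment_index(receiver, sender), wanted


def run():
    """Each node multicasts one coded segment, then every node finishes its task."""
    messages = {s: encode(s) for s in STORAGE}
    results = {}
    for k in STORAGE:
        recovered = [None, None]
        for s, msg in messages.items():
            if s != k:
                idx, seg = decode(k, s, msg)
                recovered[idx] = seg
        local = local_combine(k, k, STORAGE[k])
        results[k] = tuple(a + b for a, b in zip(local, recovered))
    return results
```

In this toy, each of the three multicasts carries a single segment (half of a combined packet), whereas sending each node its combined packet without coding would take three full packets. The figures apply only to this made-up placement and say nothing about the load achieved in the paper.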
Pages: 2032-2036
Page count: 5
Related papers
50 in total
  • [21] Cascaded Coded Distributed Computing on Heterogeneous Networks
    Woolsey, Nicholas
    Chen, Rong-Rong
    Ji, Mingyue
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019, : 2644 - 2648
  • [22] Masterless Coded Computing: A Fully-Distributed Coded FFT Algorithm
    Jeong, Haewon
    Low, Tze Meng
    Grover, Pulkit
    2018 56TH ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2018, : 887 - 894
  • [23] Coded Distributed Computing: Performance Limits and Code Designs
    Jamali, Mohammad Vahid
    Soleymani, Mahdi
    Mahdavifar, Hessam
    2019 IEEE INFORMATION THEORY WORKSHOP (ITW), 2019, : 399 - 403
  • [24] Coded Computing for Multi-Cluster Distributed Computations
    Wu, Youlong
    Li, Chenglin
    Hu, Haoyang
    Song, Xiyu
    Ma, Shuai
    Shi, Yuanming
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2025, 73 (02) : 1114 - 1127
  • [25] An Incentive Mechanism for Resource Allocation in Coded Distributed Computing
    Ng, Jer Shyuan
    Lim, Wei Yang Bryan
    Xiong, Zehui
    Garg, Sahil
    Niyato, Dusit
    Leung, Cyril
    2021 IEEE 15TH INTERNATIONAL CONFERENCE ON BIG DATA SCIENCE AND ENGINEERING (BIGDATASE 2021), 2021, : 15 - 22
  • [26] How to Optimally Allocate Resources for Coded Distributed Computing?
    Yu, Qian
    Li, Songze
    Maddah-Ali, Mohammad Ali
    Avestimehr, A. Salman
    2017 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2017,
  • [27] Coded Reactive Stragglers Mitigation in Distributed Computing Systems
    Ardakani, Maryam Haghighi
    Ardakani, Masoud
    Tellambura, Chintha
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 3750 - 3755
  • [28] Coded Distributed Computing over Packet Erasure Channels
    Han, Dong-Jun
    Sohn, Jy-Yong
    Moon, Jaekyun
    2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT), 2019, : 717 - 721
  • [29] Coded Reactive Stragglers Mitigation in Distributed Computing Systems
    Ardakani, Maryam Haghighi
    Ardakani, Masoud
    Tellambura, Chintha
    IEEE TRANSACTIONS ON COMMUNICATIONS, 2024, 72 (08) : 4527 - 4537
  • [30] Coded Distributed Computing: Fundamental Limits and Practical Challenges
    Li, Songze
    Yu, Qian
    Maddah-Ali, Mohammad Ali
    Avestimehr, A. Salman
    2016 50TH ASILOMAR CONFERENCE ON SIGNALS, SYSTEMS AND COMPUTERS, 2016, : 509 - 513