Network-Aware Optimization of Distributed Learning for Fog Computing

Cited by: 20
Authors
Wang, Su [1]
Ruan, Yichen [2]
Tu, Yuwei [3]
Wagle, Satyavrat [2]
Brinton, Christopher G. [1]
Joe-Wong, Carlee [2]
Affiliations
[1] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
[2] Carnegie Mellon Univ, Dept Elect & Comp Engn, Mountain View, CA 94035 USA
[3] Aetna, New York, NY 10027 USA
Keywords
Computational modeling; Data models; Servers; Training; Distributed databases; Task analysis; Network topology; Federated learning; offloading; fog computing
DOI
10.1109/TNET.2021.3075432
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Fog computing promises to enable machine learning tasks to scale to large amounts of data by distributing processing across connected devices. Two key challenges to achieving this goal are (i) heterogeneity in devices' compute resources and (ii) topology constraints on which devices communicate with each other. We address these challenges by developing a novel network-aware distributed learning methodology where devices optimally share local data processing and send their learnt parameters to a server for periodic aggregation. Unlike traditional federated learning, our method enables devices to offload their data processing tasks to each other, with these decisions optimized to trade off costs associated with data processing, offloading, and discarding. We analytically characterize the optimal data transfer solution under different assumptions on the fog network scenario, showing for example that the value of offloading is approximately linear in the range of computing costs in the network when the cost of discarding is modeled as decreasing linearly in the amount of data processed at each node. Our experiments on real-world data traces from our testbed confirm that our algorithms improve network resource utilization substantially without sacrificing the accuracy of the learned model, for varying distributions of data across devices. We also investigate the effect of network dynamics on model learning and resource costs.
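As a rough illustration of the trade-off the abstract describes, the sketch below picks the cheapest action for one unit of data at a node: process it locally, offload it to a neighbor, or discard it. It is a toy model and not the paper's formulation: it assumes purely linear per-unit costs and no capacity limits, and the function name `cheapest_action` and all cost values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Action = Tuple[str, Optional[str], float]  # (action, target node, per-unit cost)


def cheapest_action(
    node: str,
    proc_cost: Dict[str, float],
    link_cost: Dict[Tuple[str, str], float],
    neighbors: Dict[str, List[str]],
    discard_cost: float,
) -> Action:
    """Pick the lowest-cost option for one unit of data held at `node`.

    Assumption (not from the paper): every cost is linear per unit of data
    and nodes have unlimited capacity, so each unit can be decided on its own.
    """
    options: List[Action] = [
        ("process", node, proc_cost[node]),  # process locally
        ("discard", None, discard_cost),     # drop the data unit
    ]
    # Topology constraint: only directly connected neighbors are candidates.
    for nbr in neighbors.get(node, []):
        options.append(("offload", nbr, link_cost[(node, nbr)] + proc_cost[nbr]))
    return min(options, key=lambda option: option[2])


# Hypothetical three-node line topology a - b - c with made-up costs.
proc_cost = {"a": 5.0, "b": 1.0, "c": 2.0}
neighbors = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
link_cost = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 0.5, ("c", "b"): 0.5}
discard_cost = 4.0

decisions = {
    n: cheapest_action(n, proc_cost, link_cost, neighbors, discard_cost)
    for n in proc_cost
}
```

In this made-up setting the slow node `a` and the mid-speed node `c` both send their data to the fast node `b`, which processes its own data locally. With capacity limits or nonlinear discard costs the per-unit decisions would no longer be independent, and a joint optimization would be needed.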
Pages: 2019-2032
Page count: 14
Related papers
50 in total
  • [1] Network-Aware Optimization of Distributed Learning for Fog Computing
    Tu, Yuwei
    Ruan, Yichen
    Wagle, Satyavrat
    Brinton, Christopher G.
    Joe-Wong, Carlee
    [J]. IEEE INFOCOM 2020 - IEEE CONFERENCE ON COMPUTER COMMUNICATIONS, 2020, : 2509 - 2518
  • [2] Network-aware distributed computing: A case study
    Tangmunarunkit, H
    Steenkiste, P
    [J]. PARALLEL AND DISTRIBUTED PROCESSING, 1998, 1388 : 171 - 182
  • [3] Quality of Service Provision in Fog Computing: Network-Aware Scheduling of Containers
    Caminero, Agustin C.
    Munoz-Mansilla, Rocio
    [J]. SENSORS, 2021, 21 (12)
  • [4] Towards Network-Aware Resource Provisioning in Kubernetes for Fog Computing applications
    Santos, Jose
    Wauters, Tim
    Volckaert, Bruno
    De Turck, Filip
    [J]. PROCEEDINGS OF THE 2019 IEEE CONFERENCE ON NETWORK SOFTWARIZATION (NETSOFT 2019), 2019, : 351 - 359
  • [5] Network-Aware Distributed Machine Learning Over Wide Area Network
    Zhou, Pan
    Sun, Gang
    Yu, Hongfang
    Chang, Victor
    [J]. MODERN INDUSTRIAL IOT, BIG DATA AND SUPPLY CHAIN, IIOTBDSC 2020, 2021, 218 : 55 - 62
  • [6] FedFog: Network-Aware Optimization of Federated Learning Over Wireless Fog-Cloud Systems
    Nguyen, Van-Dinh
    Chatzinotas, Symeon
    Ottersten, Bjorn
    Duong, Trung Q.
    [J]. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2022, 21 (10) : 8581 - 8599
  • [7] Dynamic network-aware container allocation in Cloud/Fog computing with mobile nodes
    Tsokov, Tsvetan
    Kostadinov, Hristo
    [J]. INTERNET OF THINGS, 2024, 26
  • [8] Network-aware parallel computing with Remos
    Lowekamp, B
    Miller, N
    Sutherland, D
    Gross, T
    Steenkiste, P
    Subhlok, J
    [J]. LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING, 1999, 1656 : 100 - 119
  • [9] A distributed network-aware TSCH scheduling
    Vieira Junior, Ivanilson Franca
    Granjal, Jorge
    Curado, Marilia
    [J]. 2023 19TH INTERNATIONAL CONFERENCE ON THE DESIGN OF RELIABLE COMMUNICATION NETWORKS, DRCN, 2023,
  • [10] An Optimal Network-Aware Scheduling Technique for Distributed Deep Learning in Distributed HPC Platforms
    Lee, Sangkwon
    Shah, Syed Asif Raza
    Seok, Woojin
    Moon, Jeonghoon
    Kim, Kihyeon
    Shah, Syed Hasnain Raza
    [J]. ELECTRONICS, 2023, 12 (14)