Adaptive Load Balancing for Parameter Servers in Distributed Machine Learning over Heterogeneous Networks

Cited by: 1
Authors
CAI Weibo [1]
YANG Shulin [1]
SUN Gang [1]
ZHANG Qiming [2]
YU Hongfang [1]
Affiliations
[1] University of Electronic Science and Technology of China
[2] ZTE Corporation
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification
TN91 [Communications]; TP181 [Automated reasoning, machine learning]
Discipline classification codes
0810; 081001; 081104; 0812; 0835; 1405
Abstract
In distributed machine learning (DML) based on the parameter server (PS) architecture, an unbalanced distribution of communication load across PSs significantly slows model synchronization in heterogeneous networks because bandwidth is poorly utilized. To address this problem, a network-aware adaptive PS load distribution scheme is proposed, which accelerates model synchronization by proactively adjusting the communication load on PSs according to network states. We evaluate the proposed scheme on MXNet, a real-world distributed training platform, and the results show that it achieves up to a 2.68-fold speed-up of model training in a dynamic and heterogeneous network environment.
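The abstract does not give the scheme's actual algorithm, so the following is only an illustrative sketch of the general idea of network-aware load distribution, not the authors' method. It assumes that each PS's available bandwidth has already been measured, that load is split in proportion to that bandwidth, and that synchronization finishes when the slowest PS finishes. The function names `proportional_shares` and `sync_time_seconds` and all numbers are hypothetical.

```python
from typing import Dict


def proportional_shares(bandwidth_mbps: Dict[str, float]) -> Dict[str, float]:
    """Assign each PS a fraction of the model proportional to its bandwidth.

    Assumption for illustration only: the abstract does not specify how
    the proposed scheme maps network state to load.
    """
    total = sum(bandwidth_mbps.values())
    if total <= 0:
        raise ValueError("at least one PS must have positive bandwidth")
    return {ps: bw / total for ps, bw in bandwidth_mbps.items()}


def sync_time_seconds(model_mbit: float,
                      shares: Dict[str, float],
                      bandwidth_mbps: Dict[str, float]) -> float:
    """Time until the slowest PS has transferred its share of the model."""
    return max(model_mbit * share / bandwidth_mbps[ps]
               for ps, share in shares.items() if share > 0)


# Hypothetical heterogeneous network: one fast PS and two slow ones.
bandwidth = {"ps0": 1000.0, "ps1": 250.0, "ps2": 250.0}
model_mbit = 1500.0

even = {ps: 1.0 / len(bandwidth) for ps in bandwidth}
adaptive = proportional_shares(bandwidth)

t_even = sync_time_seconds(model_mbit, even, bandwidth)
t_adaptive = sync_time_seconds(model_mbit, adaptive, bandwidth)
print(f"even split: {t_even:.2f} s, bandwidth-proportional: {t_adaptive:.2f} s")
```

In this toy setting the even split is bottlenecked by the slow PSs, while the proportional split lets all PSs finish at about the same time. A dynamic network would require re-measuring bandwidth and recomputing the shares periodically, which this sketch leaves out.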
Pages: 72-80
Page count: 9