On Datacenter-Network-Aware Load Balancing in MapReduce

Cited by: 2
Authors
Le, Yanfang [1]
Wang, Feng [2]
Liu, Jiangchuan [1]
Ergun, Funda [1, 3]
Affiliations
[1] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
[2] Univ Mississippi, University, MS 38677 USA
[3] Indiana Univ Bloomington, Bloomington, IN USA
DOI
10.1109/CLOUD.2015.71
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
MapReduce has emerged as a powerful tool for distributed and scalable processing of voluminous data. For skewed data input, load balancing among the MapReduce worker nodes is necessary to minimize the overall finishing time, but it can incur massive data movement in a datacenter network. In this paper, we examine, for the first time, the problem of datacenter-network-aware load balancing in the shuffle subphase of MapReduce. Unlike earlier studies, which generally assume that the network inside a datacenter has negligible delay and infinite capacity, we account for the traffic and bottlenecks in real datacenter networks by introducing constraints on available network bandwidth, and we demonstrate that the resulting problem can be decomposed into two subproblems, one for network flow and one for load balancing. We present effective solutions to both, which together yield a complete solution for near-optimal datacenter-network-aware load balancing. We also develop a much simpler greedy algorithm with comparable performance for fast implementation in practice. The effectiveness of our solution is demonstrated on synthetic and real public datasets.
Pages: 485-492
Page count: 8