Load Balancing in Heterogeneous MapReduce Environments

Cited by: 2
Authors
Fan, Yuanquan [1]
Wu, Weiguo [1]
Qian, Depei [1]
Xu, Yunlong [1]
Wei, Wei [1]
Affiliation
[1] Xi'an Jiaotong University, Department of Computer Science and Technology, Xi'an 710049, People's Republic of China
Keywords
MapReduce; load balancing; heterogeneous cluster; heterogeneity-aware partitioning
DOI
10.1109/HPCC.and.EUC.2013.209
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
MapReduce has emerged as a popular computing model for parallel processing of big data. However, we observe that the native hash partitioning of MapReduce systems frequently distributes data unevenly among reduce tasks. This uneven distribution causes load imbalance among reduce tasks and thus hampers the performance of MapReduce systems. Moreover, heterogeneity among cluster nodes exacerbates these negative effects, because the performance of the nodes varies. To address these issues, we propose a novel load balancing approach that accounts for cluster heterogeneity. The approach consists of two components: (1) performance estimation for reducers running on heterogeneous nodes, based on the history of reduce tasks, and (2) heterogeneity-aware partitioning (HAP), which reallocates the input data of reduce tasks according to the estimated reducer performance. We implement this approach as a plug-in for a current MapReduce system. Experimental results show that our approach improves the performance of MapReduce jobs running in heterogeneous systems and incurs little overhead.
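The two components named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' HAP algorithm, whose details the abstract does not give. It assumes a simple average-throughput estimate and a greedy assignment that minimises estimated finish time. The function names `estimate_speeds` and `partition`, the history tuple layout, and the per-key size input are all hypothetical.

```python
from collections import defaultdict


def estimate_speeds(history):
    """Estimate reducer throughput per node from past reduce tasks.

    `history` is an assumed format: (node, bytes_processed, seconds) tuples.
    Returns bytes per second for every node with a non-zero recorded time.
    """
    total_bytes = defaultdict(float)
    total_secs = defaultdict(float)
    for node, nbytes, secs in history:
        total_bytes[node] += nbytes
        total_secs[node] += secs
    return {
        node: total_bytes[node] / total_secs[node]
        for node in total_bytes
        if total_secs[node] > 0
    }


def partition(key_sizes, speeds):
    """Assign keys to reducers, weighting load by estimated speed.

    Greedy heuristic (an assumption, not the paper's method): take keys
    from largest to smallest and give each one to the reducer whose
    estimated finish time, (load + size) / speed, would be smallest.
    Ties are broken by node name so the result is deterministic.
    """
    load = {node: 0.0 for node in speeds}
    assignment = {}
    ordered = sorted(key_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for key, size in ordered:
        best = min(speeds, key=lambda n: ((load[n] + size) / speeds[n], n))
        assignment[key] = best
        load[best] += size
    return assignment, load
```

Calling `estimate_speeds` on the recorded history and passing its result to `partition` gives a key-to-reducer mapping in which faster nodes receive proportionally more data. A plain hash partitioner, by contrast, ignores both key sizes and node speed.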
Pages: 1480-1489
Number of pages: 10