Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment

Cited by: 8
Authors
Jeyaraj, Rathinaraja [1]
Ananthanarayana, V. S. [1]
Paul, Anand [2]
Affiliations
[1] Natl Inst Technol Karnataka, Dept IT, Mangalore, Karnataka, India
[2] Kyungpook Natl Univ, Sch Comp Sci & Engn, 80 Daehakro, Daegu 702701, South Korea
Source
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (17)
Funding
National Research Foundation, Singapore;
Keywords
bin packing; heterogeneous workloads; jobs; map; reduce task placement; DATA PLACEMENT; BIG DATA; PERFORMANCE;
DOI
10.1002/cpe.5558
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Big data is pushing businesses and research sectors to become more data-driven. Hadoop MapReduce is a cost-effective way to process large-scale datasets and is offered as a service over the Internet. Even though cloud service providers promise an infinite amount of resources on demand, some of the virtual resources hired for MapReduce are inevitably left unutilized, and makespan is limited, because of the various heterogeneities that arise when MapReduce is offered as a service. Because MapReduce v2 allows users to define the size of containers for map and reduce tasks, jobs in a batch become heterogeneous and behave differently. In addition, virtual machines of different capacities in the MapReduce virtual cluster accommodate varying numbers of map/reduce tasks. These factors strongly affect resource utilization in the virtual cluster and the makespan for a batch of MapReduce jobs. Default MapReduce job schedulers do not consider these heterogeneities in a cloud environment. Moreover, virtual machines in a MapReduce virtual cluster process an equal number of blocks regardless of their capacity, which affects the makespan. Therefore, we devised a heuristic-based MapReduce job scheduler that exploits virtual machine and MapReduce workload level heterogeneities to improve resource utilization and makespan. We proposed two methods to achieve this: (i) roulette wheel scheme-based data block placement in heterogeneous virtual machines, and (ii) a constrained 2-dimensional bin packing to place heterogeneous map/reduce tasks. We compared the heuristic-based MapReduce job scheduler against the classical fair scheduler in MapReduce v2. Experimental results showed that our proposed scheduler improved makespan and resource utilization by 45.6% and 47.9%, respectively, over the classical fair scheduler.
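The abstract names the two techniques but does not give their details, so the following is only a minimal sketch of the general ideas, not the authors' scheduler. It assumes that roulette wheel placement means each block goes to a virtual machine with probability proportional to a capacity weight, and it substitutes a plain first-fit-decreasing heuristic over two resource dimensions (vcores and memory) for the paper's constrained 2-dimensional bin packing. All function names, VM names, and numbers are hypothetical.

```python
import random
from collections import Counter


def place_blocks(num_blocks, vm_capacity, seed=0):
    """Roulette-wheel selection (assumed form): each block is assigned to a
    VM with probability proportional to that VM's capacity weight."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    vms = list(vm_capacity)
    weights = [vm_capacity[vm] for vm in vms]
    picks = rng.choices(vms, weights=weights, k=num_blocks)
    return Counter(picks)


def pack_tasks(tasks, vm_resources):
    """First-fit-decreasing packing over two dimensions (vcores, memory_mb).

    tasks: list of (name, vcores, memory_mb) container requests.
    vm_resources: dict mapping VM name to (vcores, memory_mb) available.
    This is a stand-in heuristic, not the paper's constrained formulation.
    """
    free = {vm: list(res) for vm, res in vm_resources.items()}
    placement = {vm: [] for vm in vm_resources}
    unplaced = []
    # Consider the largest containers first (by vcores, then memory).
    for name, cores, mem in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        for vm, (free_cores, free_mem) in free.items():
            # A task fits only if BOTH dimensions fit on the same VM.
            if cores <= free_cores and mem <= free_mem:
                free[vm][0] -= cores
                free[vm][1] -= mem
                placement[vm].append(name)
                break
        else:
            unplaced.append(name)
    return placement, unplaced


if __name__ == "__main__":
    # Hypothetical cluster: larger VMs should receive more blocks on average.
    print(place_blocks(1000, {"small": 1, "medium": 2, "large": 4}))

    # Hypothetical map/reduce container requests and VM capacities.
    tasks = [("m1", 1, 1024), ("m2", 1, 2048), ("r1", 2, 4096), ("r2", 2, 3072)]
    vms = {"vm_a": (2, 4096), "vm_b": (4, 8192)}
    print(pack_tasks(tasks, vms))
```

In this sketch, capacity-weighted placement avoids giving every virtual machine an equal number of blocks, and checking both dimensions together reflects that a container fits only when its vcore and memory requests are both satisfied on one machine.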
Pages: 10
Related papers
50 in total
  • [1] Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment
    Jeyaraj, Rathinaraja
    Ananthanarayana, V. S.
    Paul, Anand
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2020, 32 (17):
  • [2] A Dynamic MapReduce Scheduler for Heterogeneous Workloads
    Tian, Chao
    Zhou, Haojie
    He, Yongqiang
    Zha, Li
    [J]. 2009 EIGHTH INTERNATIONAL CONFERENCE ON GRID AND COOPERATIVE COMPUTING, PROCEEDINGS, 2009, : 218 - 224
  • [3] MapReduce Scheduler Using Classifiers for Heterogeneous Workloads
    Visalakshi, P.
    Karthik, T. U.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2011, 11 (04): : 68 - 73
  • [4] Job Classification for MapReduce Scheduler in Heterogeneous Environment
    Deshmukh, Shyam
    Aghav, J. V.
    Chakravarthy, Rohan
    [J]. 2013 INTERNATIONAL CONFERENCE ON CLOUD & UBIQUITOUS COMPUTING & EMERGING TECHNOLOGIES (CUBE 2013), 2013, : 26 - +
  • [5] A Usage-Aware Scheduler for Improving MapReduce Performance in Heterogeneous Environments
    Hsiao, J. H.
    Kao, S. J.
    [J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1647 - +
  • [6] An Adaptive MapReduce Scheduler for Scalable Heterogeneous Systems
    Ghoneem, Mohammad
    Kulkarni, Lalit
    [J]. PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON DATA ENGINEERING AND COMMUNICATION TECHNOLOGY, ICDECT 2016, VOL 2, 2017, 469 : 603 - 611
  • [7] MixHeter: A global scheduler for mixed workloads in heterogeneous environments
    Zhang, Xiao
    Lyu, Yinrun
    Wu, Yanjun
    Zhao, Chen
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2018, 111 : 93 - 103
  • [8] A Learning-based MapReduce Scheduler in Heterogeneous Environments
    Naik, Nenavath Srinivas
    Negi, Atul
    [J]. 2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 2020 - 2025
  • [9] POSUM: A Portfolio Scheduler for MapReduce Workloads
    Voinea, Maria A.
    Uta, Alexandru
    Iosup, Alexandru
    [J]. 2018 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2018, : 351 - 357
  • [10] DyScale: A MapReduce Job Scheduler for Heterogeneous Multicore Processors
    Yan, Feng
    Cherkasova, Ludmila
    Zhang, Zhuoyao
    Smirni, Evgenia
    [J]. IEEE TRANSACTIONS ON CLOUD COMPUTING, 2017, 5 (02) : 317 - 330