Improving MapReduce scheduler for heterogeneous workloads in a heterogeneous environment

Cited by: 8
Authors
Jeyaraj, Rathinaraja [1]
Ananthanarayana, V. S. [1]
Paul, Anand [2]
Affiliations
[1] Natl Inst Technol Karnataka, Dept IT, Mangalore, Karnataka, India
[2] Kyungpook Natl Univ, Sch Comp Sci & Engn, 80 Daehakro, Daegu 702701, South Korea
Funding
National Research Foundation of Singapore
Keywords
bin packing; heterogeneous workloads; jobs; map; reduce task placement; DATA PLACEMENT; BIG DATA; PERFORMANCE;
DOI
10.1002/cpe.5558
Chinese Library Classification number
TP31 [Computer software]
Subject classification codes
081202; 0835
Abstract
Big data is pushing businesses and research sectors to become more data-driven. Hadoop MapReduce is one of the cost-effective ways to process large-scale datasets and is offered as a service over the Internet. Even though cloud service providers promise an infinite amount of resources available on demand, some of the virtual resources hired for MapReduce inevitably go unutilized, and makespan suffers, because of the various heterogeneities that arise when MapReduce is offered as a service. Because MapReduce v2 allows users to define the size of containers for map and reduce tasks, the jobs in a batch become heterogeneous and behave differently. In addition, virtual machines of different capacities in the MapReduce virtual cluster accommodate a varying number of map/reduce tasks. These factors strongly affect resource utilization in the virtual cluster and the makespan of a batch of MapReduce jobs. Default MapReduce job schedulers do not consider these heterogeneities of a cloud environment. Moreover, virtual machines in a MapReduce virtual cluster process an equal number of blocks regardless of their capacity, which also affects the makespan. Therefore, we devised a heuristic-based MapReduce job scheduler that exploits heterogeneity at both the virtual machine level and the MapReduce workload level to improve resource utilization and makespan. We proposed two methods to achieve this: (i) roulette-wheel-based data block placement across heterogeneous virtual machines, and (ii) constrained 2-dimensional bin packing to place heterogeneous map/reduce tasks. We compared the heuristic-based MapReduce job scheduler against the classical fair scheduler in MapReduce v2. Experimental results showed that the proposed scheduler improved makespan and resource utilization by 45.6% and 47.9%, respectively, over the classical fair scheduler.
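The two methods named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no algorithmic detail, so the function names (`place_blocks`, `first_fit_2d`), the VM names, the use of a single capacity number per VM as the roulette-wheel weight, and the choice of vcores and memory as the two packing dimensions are all assumptions made for illustration. The packing part uses a plain first-fit-decreasing heuristic as a stand-in for the paper's constrained 2-dimensional bin packing.

```python
import random


def place_blocks(num_blocks, vm_capacity, seed=None):
    """Assign each data block to a VM by roulette-wheel selection.

    vm_capacity maps a (hypothetical) VM name to a non-negative weight;
    a VM is picked with probability proportional to its weight, so
    larger VMs tend to receive more blocks.
    """
    rng = random.Random(seed)
    vms = list(vm_capacity)
    weights = [vm_capacity[vm] for vm in vms]
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one VM needs a positive capacity")

    placement = []
    for _ in range(num_blocks):
        spin = rng.random() * total      # point on the wheel, in [0, total)
        cumulative = 0.0
        chosen = vms[-1]                 # fallback against rounding error
        for vm, weight in zip(vms, weights):
            cumulative += weight
            if spin < cumulative:
                chosen = vm
                break
        placement.append(chosen)
    return placement


def first_fit_2d(tasks, vm_free):
    """Place tasks on VMs using first-fit decreasing in two dimensions.

    tasks   : list of (name, vcores, mem_mb) container requests
    vm_free : dict of VM name -> (free_vcores, free_mem_mb)
    Returns a dict of task name -> VM name, or None if nothing fits.
    """
    free = {vm: list(cap) for vm, cap in vm_free.items()}
    assignment = {}
    # Consider the largest containers first.
    for name, cores, mem in sorted(tasks, key=lambda t: (t[1], t[2]), reverse=True):
        for vm, (free_cores, free_mem) in free.items():
            if cores <= free_cores and mem <= free_mem:
                free[vm][0] -= cores
                free[vm][1] -= mem
                assignment[name] = vm
                break
        else:
            assignment[name] = None      # no VM has room in both dimensions
    return assignment


if __name__ == "__main__":
    blocks = place_blocks(20, {"vm-small": 1, "vm-large": 3}, seed=42)
    print(blocks)
    tasks = [("map-1", 1, 1024), ("map-2", 2, 2048), ("reduce-1", 4, 4096)]
    print(first_fit_2d(tasks, {"vm1": (2, 2048), "vm2": (4, 8192)}))
```

In this sketch a task is placed only when both of its dimensions fit on the same VM, which is the property that distinguishes 2-dimensional packing from packing on a single aggregate resource.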
Pages: 10