A data locality based scheduler to enhance MapReduce performance in heterogeneous environments

Cited by: 37
Authors
Naik, Nenavath Srinivas [1]
Negi, Atul [1]
Bapu, Tapas B. R. [2]
Anitha, R. [2]
Affiliations
[1] Univ Hyderabad, Sch Comp & Informat Sci, Hyderabad 500046, India
[2] SA Engn Coll, Madras, Tamil Nadu, India
Keywords
MapReduce; Data locality; Task scheduler; Heterogeneous environments; PATH;
DOI
10.1016/j.future.2018.07.043
Chinese Library Classification (CLC) number
TP301 [Theory and methods];
Discipline classification code
081202;
Abstract
MapReduce is an essential framework for the distributed storage and parallel processing of large-scale, data-intensive jobs. The Hadoop default scheduler assumes a homogeneous environment. This assumption does not always hold in practice, and it limits the performance of MapReduce. Data locality essentially means moving computation closer to the input data for faster access. Fundamentally, MapReduce does not always consider heterogeneity from a data locality perspective. Improving data locality in the MapReduce framework is important for improving the performance of large-scale Hadoop clusters. This paper proposes a novel data locality based scheduler that allocates input data blocks to nodes based on their processing capacity. It also schedules map and reduce tasks to nodes based on their computing ability in the heterogeneous Hadoop cluster. We evaluate the proposed scheduler using different workloads from the Hi-Bench benchmark suite. The experimental results show that the proposed scheduler enhances MapReduce performance in heterogeneous environments: it minimizes job execution time and improves data locality for different parameters compared to the Hadoop default scheduler, the Matchmaking scheduler, and the Delay scheduler. (C) 2018 Elsevier B.V. All rights reserved.
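The abstract describes allocating input data blocks to nodes according to their processing capacity. The record does not give the paper's actual algorithm, so the following is only a minimal sketch of one plausible reading: blocks are split in proportion to a per-node capacity score, with largest-remainder rounding so the counts sum to the total. The function name `allocate_blocks`, the node names, and the capacity scores are all hypothetical and chosen for illustration.

```python
def allocate_blocks(num_blocks, capacities):
    """Split num_blocks across nodes in proportion to their capacity.

    Illustrative only: this is NOT the scheduler from the paper, just a
    proportional split with largest-remainder rounding.
    `capacities` maps a (hypothetical) node name to a positive score.
    """
    if num_blocks < 0:
        raise ValueError("num_blocks must be non-negative")
    total = sum(capacities.values())
    if total <= 0:
        raise ValueError("total capacity must be positive")

    # Ideal (fractional) share of blocks for each node.
    exact = {node: num_blocks * cap / total for node, cap in capacities.items()}
    # Start from the integer part of each share.
    alloc = {node: int(share) for node, share in exact.items()}
    leftover = num_blocks - sum(alloc.values())

    # Give the remaining blocks to the nodes with the largest fractional
    # parts, breaking ties in favour of the higher-capacity node.
    by_remainder = sorted(
        capacities,
        key=lambda n: (exact[n] - alloc[n], capacities[n]),
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        alloc[node] += 1
    return alloc


if __name__ == "__main__":
    # Hypothetical three-node heterogeneous cluster.
    print(allocate_blocks(10, {"fast": 4, "medium": 2, "slow": 1}))
    # -> {'fast': 6, 'medium': 3, 'slow': 1}
```

In this sketch a node with twice the capacity score receives roughly twice as many blocks, so faster nodes hold more of the input locally. How capacity is actually measured in the paper is not stated in this record.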
Pages: 423-434
Number of pages: 12
Related papers
50 in total
  • [1] A Learning-based MapReduce Scheduler in Heterogeneous Environments
    Naik, Nenavath Srinivas
    Negi, Atul
    [J]. 2017 INTERNATIONAL CONFERENCE ON ADVANCES IN COMPUTING, COMMUNICATIONS AND INFORMATICS (ICACCI), 2017, : 2020 - 2025
  • [2] A Usage-Aware Scheduler for Improving MapReduce Performance in Heterogeneous Environments
    Hsiao, J. H.
    Kao, S. J.
    [J]. 2014 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE, ELECTRONICS AND ELECTRICAL ENGINEERING (ISEEE), VOLS 1-3, 2014, : 1647 - +
  • [3] Enhancing the Performance of MapReduce Default Scheduler by Detecting Prolonged TaskTrackers in Heterogeneous Environments
    Naik, Nenavath Srinivas
    Negi, Atul
    Sastry, V. N.
    [J]. PROCEEDINGS OF THE SECOND INTERNATIONAL CONFERENCE ON COMPUTER AND COMMUNICATION TECHNOLOGIES, IC3T 2015, VOL 2, 2016, 380 : 225 - 233
  • [4] TMaR: a two-stage MapReduce scheduler for heterogeneous environments
    Maleki, Neda
    Faragardi, Hamid Reza
    Rahmani, Amir Masoud
    Conti, Mauro
    Lofstead, Jay
    [J]. HUMAN-CENTRIC COMPUTING AND INFORMATION SCIENCES, 2020, 10 (01)
  • [5] Design Dynamic Data Allocation Scheduler to Improve MapReduce Performance in Heterogeneous Clouds
    Yang, Shin-Jer
    Chen, Yi-Ru
    Hsieh, Yung-Ming
    [J]. 2012 NINTH IEEE INTERNATIONAL CONFERENCE ON E-BUSINESS ENGINEERING (ICEBE), 2012, : 265 - 270
  • [6] Improving MapReduce Performance by Data Prefetching in Heterogeneous or Shared Environments
    Gu, Tao
    Zuo, Chuang
    Liao, Qun
    Yang, Yulu
    Li, Tao
    [J]. INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2013, 6 (05) : 71 - 81
  • [7] A Predictive Map Task Scheduler for Optimizing Data Locality in MapReduce Clusters
    Merabet, Mohamed
    Benslimane, Sidi Mohamed
    Barhamgi, Mahmoud
    Bonnet, Christine
    [J]. INTERNATIONAL JOURNAL OF GRID AND HIGH PERFORMANCE COMPUTING, 2018, 10 (04) : 1 - 14
  • [8] Locality Based Data Partitioning in MapReduce
    Wang, Chunguang
    Wu, Qingbo
    Tan, Yusong
    Wang, Wenzhu
    Wu, Quanyuan
    [J]. 2013 IEEE 16TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2013), 2013, : 1310 - 1317
  • [9] A Dynamic MapReduce Scheduler for Heterogeneous Workloads
    Tian, Chao
    Zhou, Haojie
    He, Yongqiang
    Zha, Li
    [J]. 2009 EIGHTH INTERNATIONAL CONFERENCE ON GRID AND COOPERATIVE COMPUTING, PROCEEDINGS, 2009, : 218 - 224
  • [10] IDaPS - Improved data-locality aware data placement strategy based on Markov clustering to enhance MapReduce performance on Hadoop
    Vengadeswaran, S.
    Balasundaram, S. R.
    Dhavakumar, P.
    [J]. JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2024, 36 (03)