HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework

Cited by: 15
Authors
Gandomi, Abolfazl [1]
Reshadi, Midia [1]
Movaghar, Ali [2]
Khademzadeh, Ahmad [3]
Affiliations
[1] Islamic Azad Univ, Dept Comp Engn, Sci & Res Branch, Tehran, Iran
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
[3] Iran Telecommun Res Ctr, ITRC, Tehran, Iran
Keywords
MapReduce; Scheduling; Hybrid algorithm; Data Locality; Dynamic priority; LOCALITY; PERFORMANCE
DOI
10.1186/s40537-019-0253-9
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Due to the advent of new technologies, devices, and communication tools such as social networking sites, the amount of data produced by mankind is growing rapidly every year. Big data is a collection of large datasets that cannot be processed using traditional computing techniques. MapReduce was introduced to solve large-scale data processing problems. It is specifically designed to run on commodity hardware and relies on the divide-and-conquer principle. Researchers' focus has recently shifted towards Hadoop MapReduce. One of the most outstanding characteristics of MapReduce is data locality-aware scheduling. A data locality-aware scheduler is an efficient way to optimize one or more performance metrics such as data locality, energy consumption, and job completion time. Time and scheduling are among the most important aspects of the MapReduce framework, so many scheduling algorithms have been proposed over the past decades. The main goals of these algorithms are to increase the data locality rate and to decrease response and completion times. This paper proposes a new hybrid scheduling algorithm that uses dynamic priority and localization ID techniques and focuses on increasing the data locality rate and decreasing completion time. The proposed algorithm was evaluated against the Hadoop default schedulers (FIFO, Fair) by running concurrent workloads consisting of the Wordcount and Terasort benchmarks. The experimental results show that the proposed algorithm is faster than FIFO and Fair scheduling, achieves a higher data locality rate, and avoids wasting resources.
Pages: 16
Related papers
  • [1] HybSMRP: a hybrid scheduling algorithm in Hadoop MapReduce framework
    Abolfazl Gandomi
    Midia Reshadi
    Ali Movaghar
    Ahmad Khademzadeh
    [J]. Journal of Big Data, 6
  • [2] Improving the efficiency of MapReduce scheduling algorithm in Hadoop
    Thangaselvi, R.
    Ananthbabu, S.
    Jagadeesh, S.
    Aruna, R.
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON APPLIED AND THEORETICAL COMPUTING AND COMMUNICATION TECHNOLOGY (ICATCCT), 2015, : 63 - 68
  • [3] Memory and Performance Aware Scheduling Design for Hadoop MapReduce Framework
    Bakka, Jagadevi
    Lingareddy, Sanjeev C.
    [J]. BIOSCIENCE BIOTECHNOLOGY RESEARCH COMMUNICATIONS, 2020, 13 (13): : 242 - 246
  • [4] Implementation of Page Rank Algorithm in Hadoop MapReduce Framework
    Bhawivuga, Adhitya
    Kirana, Annisa Puspa
    [J]. 2016 INTERNATIONAL SEMINAR ON INTELLIGENT TECHNOLOGY AND ITS APPLICATIONS (ISITIA): RECENT TRENDS IN INTELLIGENT COMPUTATIONAL TECHNOLOGIES FOR SUSTAINABLE ENERGY, 2016, : 231 - 235
  • [5] Hadoop MapReduce Scheduling Paradigms
    Johannessen, Roger
    Yazidi, Anis
    Feng, Boning
    [J]. 2017 2ND IEEE INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA 2017), 2017, : 175 - 179
  • [6] An Expressive Hadoop MapReduce Framework
    Shah, Nathar
    Messom, Christopher
    [J]. ADVANCED SCIENCE LETTERS, 2017, 23 (11) : 11197 - 11201
  • [7] A Scheduling Algorithm for Hadoop MapReduce Workflows with Budget Constraints in the Heterogeneous Cloud
    Wylie, Andrew
    Shi, Wei
    Corriveau, Jean-Pierre
    Wang, Yang
    [J]. 2016 IEEE 30TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2016, : 1433 - 1442
  • [8] A review on job scheduling for hadoop mapreduce
    Kalia, Khushboo
    Gupta, Neeraj
    [J]. Proceedings - 2017 International Conference on Next Generation Computing and Information Systems, ICNGCIS 2017, 2018, : 86 - 91
  • [9] Scheduling for response time in Hadoop MapReduce
    Dai, Xiangming
    Bensaou, Brahim
    [J]. 2016 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS (ICC), 2016,
  • [10] A REVIEW ON JOB SCHEDULING FOR HADOOP MAPREDUCE
    Kalia, Khushboo
    Gupta, Neeraj
    [J]. 2017 INTERNATIONAL CONFERENCE ON NEXT GENERATION COMPUTING AND INFORMATION SYSTEMS (ICNGCIS), 2017, : 75 - 79