An Enhanced Data-Locality-Aware Task Scheduling Algorithm for Hadoop Applications

Cited by: 13
Authors
Choi, Dongjoo [1 ]
Jeon, Myunghoon [1 ]
Kim, Namgi [1 ]
Lee, Byoung-Dai [1 ]
Affiliations
[1] Kyonggi Univ, Comp Sci Dept, Suwon 443760, South Korea
Source
IEEE SYSTEMS JOURNAL | 2018 / Vol. 12 / Issue 04
Funding
National Research Foundation of Singapore;
Keywords
Data locality; Hadoop distributed file system (HDFS); MapReduce; task scheduling;
DOI
10.1109/JSYST.2017.2764481
Chinese Library Classification number
TP [Automation technology, computer technology];
Subject classification code
0812 ;
Abstract
In general, Hadoop improves the task scheduling performance by determining data locality based on the location in which the input splits and MapTask are executed. However, if an input split consists of multiple data blocks that are distributed and stored in different nodes, this data location method fails to cope with the degradation in processing performance due to the increased frequency of data block copying. We propose a task scheduling algorithm that solves this issue by defining a method to classify data locality taking into account the location of all data blocks that comprise an input split, categorizing tasks based on the defined method, and sequentially assigning tasks according to a given priority. This study measures the performance of the proposed algorithm through a comparison of the total processing time, MapTask performance time, and data block copying frequency between the proposed algorithm and Hadoop's default task scheduling algorithm. The test results show that the proposed algorithm improved the total processing time by up to 25% and the data block copying frequency by up to 28%, when compared to the default algorithm.
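The scheduling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the locality categories or the priority order, so the three-level classification (all blocks local, some blocks local, no blocks local), the tie-break on blocks to copy, and the names `MapTask`, `classify`, and `pick_task` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class MapTask:
    """Hypothetical task record: one input split made of several data blocks."""
    task_id: str
    # One set per data block, holding the nodes that store a replica of it.
    block_locations: List[Set[str]]


def local_block_count(task: MapTask, node: str) -> int:
    """Number of the split's blocks that have a replica on `node`."""
    return sum(1 for replicas in task.block_locations if node in replicas)


def blocks_to_copy(task: MapTask, node: str) -> int:
    """Blocks that would have to be fetched from other nodes."""
    return len(task.block_locations) - local_block_count(task, node)


def classify(task: MapTask, node: str) -> int:
    """Assumed locality classes: 0 = all blocks local, 1 = some, 2 = none."""
    local = local_block_count(task, node)
    total = len(task.block_locations)
    if total > 0 and local == total:
        return 0
    if local > 0:
        return 1
    return 2


def pick_task(pending: List[MapTask], node: str) -> Optional[MapTask]:
    """Pick the pending task with the best locality class for `node`.

    Ties within a class go to the task needing the fewest block copies
    (an assumed tie-break, not taken from the paper).
    """
    if not pending:
        return None
    return min(pending, key=lambda t: (classify(t, node), blocks_to_copy(t, node)))


tasks = [
    MapTask("t1", [{"n1"}, {"n2"}]),        # one of two blocks on n1
    MapTask("t2", [{"n1", "n3"}, {"n1"}]),  # both blocks on n1
    MapTask("t3", [{"n2"}, {"n3"}]),        # no block on n1
]
chosen = pick_task(tasks, "n1")
```

In this sketch, a scheduler that looked only at a single block location per split could treat `t1` as local to `n1`, whereas counting every block of the split exposes the one remote copy it would require.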
Pages: 3346 - 3357
Page count: 12
Related papers
50 in total
  • [41] Data Volume-aware Computation Task Scheduling for Smart Grid Data Analytic Applications
    Guo, Binquan
    Li, Hongyan
    Yan, Ye
    Zhang, Zhou
    Wang, Peng
    [J]. ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 4113 - 4118
  • [42] Data-Driven Locality-Aware Batch Scheduling
    Gonthier, Maxime
    Larsson, Elisabeth
    Marchal, Loris
    Nettelblad, Carl
    Thibault, Samuel
    [J]. 2024 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS, IPDPSW 2024, 2024, : 202 - 211
  • [43] Locality-aware and load-balanced static task scheduling for MapReduce
    Selvitopi, Oguz
    Demirci, Gunduz Vehbi
    Turk, Ata
    Aykanat, Cevdet
    [J]. FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2019, 90 : 49 - 61
  • [44] Improved Particle Optimization Algorithm Solving Hadoop Task Scheduling Problem
    Xu, Jun
    Tang, Yong
    [J]. PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTING AND COGNITIVE INFORMATICS, 2015, : 11 - 14
  • [45] Task classification-aware data aggregation scheduling algorithm in wireless sensor networks
    Zou, Hongsen
    Li, Liang
    Ao, Chen
    Zhan, Puning
    Li, Ning
    Wang, Zheng
    [J]. International Journal of Mobile Network Design and Innovation, 2019, 9 (02): : 106 - 117
  • [46] Optimizing Load Balancing and Data-Locality with Data-aware Scheduling
    Wang, Ke
    Zhou, Xiaobing
    Li, Tonglin
    Zhao, Dongfang
    Lang, Michael
    Raicu, Ioan
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2014, : 119 - 128
  • [47] Enhanced Harris Hawks Optimization Algorithm for SLA-Aware Task Scheduling in Cloud Computing
    Liu, Junhua
    Lei, Chaoyang
    Yin, Gen
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2024, 15 (07) : 788 - 795
  • [48] Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers
    Cheng, Long
    Wang, Ying
    Liu, Qingzhi
    Epema, Dick H. J.
    Liu, Cheng
    Mao, Ying
    Murphy, John
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (06) : 1494 - 1510
  • [49] Data Locality and VM Interference Aware Mitigation of Data Skew in Hadoop Leveraging Modern Portfolio Theory
    Nabavinejad, Seyed Morteza
    Goudarzi, Maziar
    [J]. 33RD ANNUAL ACM SYMPOSIUM ON APPLIED COMPUTING, 2018, : 175 - 182
  • [50] An Energy and Data Locality Aware Bi-level Multiobjective Task Scheduling Model Based on MapReduce for Cloud Computing
    Wang, Xiaoli
    Wang, Yuping
    [J]. 2012 IEEE/WIC/ACM INTERNATIONAL CONFERENCE ON WEB INTELLIGENCE AND INTELLIGENT AGENT TECHNOLOGY (WI-IAT 2012), VOL 1, 2012, : 648 - 655