Residual Traffic Based Task Scheduling in Hadoop

Authors
Tanaka, Daichi [1]
Kawarasaki, Masatoshi [2]
Affiliations
[1] Univ Tsukuba, Grad Sch Lib Informat & Media Studies, Tsukuba, Ibaraki, Japan
[2] Univ Tsukuba, Fac Lib Informat & Media Sci, Tsukuba, Ibaraki, Japan
Keywords
distributed computing; Hadoop; MapReduce; job performance; network simulation
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In Hadoop job processing, large data transfers are reported to significantly influence job performance. In this paper, we clarify that the cause of performance deterioration in a CPU (Central Processing Unit) heterogeneous environment is the delay of the copy phase due to heavy load on the inter-rack links of the cluster network. We therefore propose a new scheduling method, Residual Traffic Based Task Scheduling, which estimates the amount of inter-rack data transfer in the copy phase and regulates task assignment accordingly. We evaluate the scheduling method using ns-3 (network simulator-3) and show that it can improve Hadoop job performance significantly.
Pages: 94-102
Page count: 9