Joint Scheduling of Tasks and Network Flows in Big Data Clusters

Cited by: 5
Authors
Yang, Lei [1]
Liu, Xuxun [2]
Cao, Jiannong [3]
Wang, Zhenyu [1]
Affiliations
[1] South China Univ Technol, Sch Software Engn, Guangzhou 510006, Guangdong, Peoples R China
[2] South China Univ Technol, Sch Elect & Informat Engn, Guangzhou 510006, Guangdong, Peoples R China
[3] Hong Kong Polytech Univ, Dept Comp, Hung Hom, Hong Kong, Peoples R China
Source
IEEE ACCESS | 2018 / Vol. 6
Funding
National Natural Science Foundation of China;
Keywords
Task scheduling; flow scheduling; data centers; software defined networks; STRATEGY;
DOI
10.1109/ACCESS.2018.2878864
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812 ;
Abstract
As a growing number of big data processing platforms such as Hadoop, Spark, and Storm appear and typically share the resources of a data center, it has become both important and challenging to schedule the various jobs from these platforms onto the underlying data center resources so that the overall job completion time is minimized. To solve this problem, existing work focuses either on task-level scheduling techniques, such as Quincy and delay scheduling, or on network flow scheduling techniques, such as D3 and preemptive distributed quick (PDQ) flow scheduling. These works handle the scheduling of tasks and of network flows separately and therefore cannot achieve optimal performance. The reason is that task scheduling without regard to the available network bandwidth may produce a task placement that causes serious network congestion and thus leads to long data transmission times. In this paper, we propose a joint scheduling technique that coordinates the task placement with the scheduling of the network flows arising from these tasks. We develop a software-defined network (SDN)-based online scheduling framework that selects the task placement based on the available bandwidth on the SDN switches and at the same time optimally allocates bandwidth to each data flow. Comprehensive trace-driven simulations show that the joint scheduling technique makes full use of the network bandwidth and thus reduces the job completion time by 55% on average compared with the benchmark methods.
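The idea of choosing a task placement with the network in mind can be illustrated with a small sketch. This is not the paper's algorithm, which the abstract does not specify; it is a simple greedy heuristic under stated assumptions: each node has one access link with a known positive capacity, concurrent flows on a link share it equally, and each node offers a fixed number of task slots. The names `Task`, `place_tasks`, `slots`, and `capacity` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    source: str      # node that holds the task's input data
    input_mb: float  # input size in MB


def place_tasks(tasks, slots, capacity):
    """Greedy bandwidth-aware placement (illustrative assumption, not the paper's method).

    slots:    node -> number of free task slots
    capacity: node -> access-link capacity in MB/s (assumed > 0)
    Returns (placement, transfer_s): task name -> node, task name -> estimated seconds.
    """
    free = dict(slots)
    flows = {node: 0 for node in capacity}  # active flows per access link
    placement, transfer_s = {}, {}

    # Place the largest inputs first, since they suffer most from congestion.
    for task in sorted(tasks, key=lambda t: t.input_mb, reverse=True):
        best_node, best_time = None, float("inf")
        for node in capacity:
            if free.get(node, 0) <= 0:
                continue
            if node == task.source:
                est = 0.0  # data-local: no network transfer needed
            else:
                # Assumed equal-share model: a new flow gets capacity / (flows + 1)
                # on each link, and the slower end is the bottleneck.
                rate = min(capacity[task.source] / (flows[task.source] + 1),
                           capacity[node] / (flows[node] + 1))
                est = task.input_mb / rate
            if est < best_time:
                best_node, best_time = node, est
        if best_node is None:
            continue  # no free slot anywhere: task stays unplaced in this sketch
        free[best_node] -= 1
        placement[task.name] = best_node
        transfer_s[task.name] = best_time
        if best_node != task.source:
            flows[task.source] += 1
            flows[best_node] += 1
    return placement, transfer_s


if __name__ == "__main__":
    demo = [Task("t1", "a", 1000.0), Task("t2", "a", 500.0)]
    print(place_tasks(demo, {"a": 1, "b": 1, "c": 1},
                      {"a": 100.0, "b": 100.0, "c": 10.0}))
```

In the demo, the second task cannot run locally because node `a` has no free slot left, so the heuristic prefers the remote node behind the faster link. A placement rule that ignored bandwidth could just as easily pick the node behind the slow link, which is the kind of congestion-induced delay the abstract attributes to task-only scheduling.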
Pages: 66600-66611
Page count: 12