GRAPHENE: Packing and Dependency-aware Scheduling for Data-Parallel Clusters

Authors
Grandl, Robert [1, 2]
Kandula, Srikanth [1]
Rao, Sriram [1]
Akella, Aditya [1, 2]
Kulkarni, Janardhan [1]
Affiliations
[1] Microsoft, Redmond, WA 98052 USA
[2] Univ Wisconsin Madison, Madison, WI 53706 USA
Chinese Library Classification
TP31 [Computer software];
Discipline classification codes
081202; 0835
Abstract
We present a new cluster scheduler, GRAPHENE, aimed at jobs that have a complex dependency structure and heterogeneous resource demands. Relaxing either of these challenges, i.e., scheduling a DAG of homogeneous tasks or an independent set of heterogeneous tasks, leads to NP-hard problems. Reasonable heuristics exist for these simpler problems, but they perform poorly when scheduling heterogeneous DAGs. Our key insights are: (1) focus on the long-running tasks and those with tough-to-pack resource demands, (2) compute a DAG schedule, offline, by first scheduling such troublesome tasks and then scheduling the remaining tasks without violating dependencies. These offline schedules are distilled to a simple precedence order and are enforced by an online component that scales to many jobs. The online component also uses heuristics to compactly pack tasks and to trade off fairness for faster job completion. Evaluation on a 200-server cluster and using traces of production DAGs at Microsoft shows that GRAPHENE improves median job completion time by 25% and cluster throughput by 30%.
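The two insights in the abstract can be illustrated with a small sketch. This is not GRAPHENE's algorithm: it is a simplified priority-based topological sort, and it does no resource packing or timeline placement. The `Task` type, the thresholds `long_threshold` and `demand_threshold`, and the function names are all hypothetical, introduced only for illustration. The sketch flags long-running or hard-to-pack tasks as "troublesome", gives them and their ancestors precedence, and emits an order that never places a task before its parents.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    duration: float
    demand: tuple          # fraction of machine capacity per resource, e.g. (cpu, mem)
    parents: tuple = ()    # names of tasks that must finish first


def is_troublesome(task, long_threshold, demand_threshold):
    # Assumption: "troublesome" means long-running OR dominant in some resource.
    return task.duration >= long_threshold or max(task.demand) >= demand_threshold


def precedence_order(tasks, long_threshold=10.0, demand_threshold=0.5):
    by_name = {t.name: t for t in tasks}
    troublesome = {t.name for t in tasks
                   if is_troublesome(t, long_threshold, demand_threshold)}

    # Ancestors of troublesome tasks are urgent too: they unblock those tasks.
    urgent = set(troublesome)
    stack = list(troublesome)
    while stack:
        for parent in by_name[stack.pop()].parents:
            if parent not in urgent:
                urgent.add(parent)
                stack.append(parent)

    order, done = [], set()
    remaining = dict(by_name)
    while remaining:
        ready = [t for t in remaining.values()
                 if all(p in done for p in t.parents)]
        if not ready:
            raise ValueError("dependency cycle detected")
        # Urgent tasks first, then longer tasks, then name as a tie-break.
        ready.sort(key=lambda t: (t.name not in urgent, -t.duration, t.name))
        chosen = ready[0]
        order.append(chosen.name)
        done.add(chosen.name)
        del remaining[chosen.name]
    return order


example = [
    Task("a", 1.0, (0.1, 0.1)),
    Task("b", 20.0, (0.2, 0.2), parents=("a",)),   # long-running
    Task("c", 2.0, (0.1, 0.1)),
    Task("d", 1.0, (0.1, 0.9), parents=("c",)),    # memory-heavy, hard to pack
    Task("e", 3.0, (0.1, 0.1)),                    # neither troublesome nor an ancestor
]
order = precedence_order(example)
```

In this toy input, task `e` is ordered last even though it is ready from the start, because it neither is troublesome nor unblocks a troublesome task. An online scheduler could then use such an order as a priority hint while still making its own packing and fairness decisions.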
Pages: 81-97
Page count: 17