Accelerating DAG-Style Job Execution via Optimizing Resource Pipeline Scheduling

Cited by: 3
Authors
Duan, Yubin [1]
Wang, Ning [2]
Wu, Jie [1]
Affiliations
[1] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA
[2] Rowan Univ, Dept Comp Sci, Glassboro, NJ 08028 USA
Funding
National Science Foundation (USA)
Keywords
data center cluster; directed acyclic graph scheduling; makespan minimization; pipeline; algorithms
DOI
10.1007/s11390-021-1488-4
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The volume of information processed in big data clusters is growing rapidly, so data analysis must be executed in a time-efficient manner. However, simply adding more computation resources may not speed up the analysis significantly. Data analysis jobs usually consist of multiple stages organized as a directed acyclic graph (DAG), and the precedence relationships between stages make scheduling challenging; general DAG scheduling is a well-known NP-hard problem. Moreover, we observe that in some parallel computing frameworks, such as Spark, the execution of a stage in a DAG contains multiple phases that use different resources. Carefully arranging the use of those resources in a pipeline can reduce their idle time and improve average resource utilization. We therefore propose a resource pipeline scheme with the objective of minimizing the job makespan. For perfectly parallel stages, we propose a contention-free scheduler with detailed theoretical analysis. We then extend the contention-free scheduler to three-phase stages, considering that the computation phase of some stages can be partitioned. Because job stages in real-world applications are usually not perfectly parallel, the parallelism levels need to be adjusted frequently during DAG execution. Since reinforcement learning (RL) techniques can adjust the scheduling policy on the fly, we investigate an RL-based scheduler for jobs that arrive online; it can adjust the resource contention adaptively. We evaluate both the contention-free and RL-based schedulers on a Spark cluster, using a real-world cluster trace dataset to simulate different DAG styles. Evaluation results show that our pipelined scheme can significantly improve CPU and network utilization.
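The idle-time argument in the abstract can be illustrated with a small sketch. It is not the authors' scheduler: it assumes each stage has only two phases (a network phase followed by a CPU phase), that the stages are independent of each other, and that each resource serves one stage at a time. The names `serial_makespan` and `pipelined_makespan` and the sample durations are hypothetical.

```python
from typing import List, Tuple

# Each stage is (network_time, cpu_time) in arbitrary, made-up time units.
Stage = Tuple[float, float]


def serial_makespan(stages: List[Stage]) -> float:
    """Run stages one after another; the CPU idles during every network phase."""
    return sum(net + cpu for net, cpu in stages)


def pipelined_makespan(stages: List[Stage]) -> float:
    """Overlap the network phase of a stage with the CPU phase of earlier ones.

    Assumption: stages run in the given order, and a stage's CPU phase
    starts only after its own network phase has finished and the CPU is free.
    """
    net_free = 0.0  # time at which the network becomes free
    cpu_free = 0.0  # time at which the CPU becomes free
    for net, cpu in stages:
        net_free += net                           # network phases run back to back
        cpu_free = max(cpu_free, net_free) + cpu  # wait for data and for the CPU
    return cpu_free


stages = [(2.0, 3.0), (4.0, 1.0), (1.0, 2.0)]
print(serial_makespan(stages))     # 13.0
print(pipelined_makespan(stages))  # 9.0
```

With these sample durations the pipelined order finishes in 9 time units instead of 13, because the network fetches for later stages proceed while the CPU works on earlier ones. Real DAG jobs add precedence constraints and resource contention, which this sketch leaves out.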
Pages: 852-868
Number of pages: 17
Source
Journal of Computer Science and Technology, 2022, 37: 852-868