Scheduling large-scale scientific workflow on virtual machines with different numbers of vCPUs

Cited by: 0
Authors
Hao Wu
Xin Chen
Xiaoyu Song
Chi Zhang
He Guo
Affiliations
[1] Dalian University of Technology, The School of Software Technology
[2] Liaoning University of Technology, The School of Electronics and Information Engineering
[3] Portland State University, The ECE Department
Keywords
Cloud computing; Scientific workflow; DAG splitting; Scheduling; Cost minimization
DOI
Not available
Abstract
With the wide deployment of cloud computing in scientific computing, cost minimization is increasingly critical for large-scale scientific workflows. Unfortunately, because workflows based on directed acyclic graphs (DAGs) are highly intricate and virtual machines (VMs) can be used flexibly on cloud platforms, existing workflow scheduling approaches fail to strike a balance between the parallelism and the topology of a DAG-based workflow when using VMs, which leads to low VM utilization and higher cost. To address these issues, this paper presents a novel task scheduling framework, the cost minimization approach with the DAG splitting method (COMSE), for minimizing the cost of running a deadline-constrained large-scale scientific workflow. First, we provide comprehensive theoretical analyses of how to improve the utilization of a resource-balanced multi-vCPU VM that runs multiple tasks simultaneously. Second, to balance the parallelism and the topology of a workflow, we simplify the DAG-based workflow and devise a DAG splitting method, based on the simplified DAG, to preprocess the workflow. Third, since cloud resources are billed by the hour, we design an exact algorithm, named instance hours minimization by Dijkstra (TOID), that finds the optimal operation pattern for a given schedule so that the consumed instance hours are minimized. Finally, by employing the DAG splitting method and TOID, COMSE schedules a deadline-constrained large-scale scientific workflow on multi-vCPU VMs with two objectives: minimizing the computation cost and minimizing the communication cost. The approach is evaluated in a rigorous performance study using real-world workflows, and the results show that COMSE outperforms existing algorithms in terms of computation cost and communication cost.
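As a rough illustration of the third step, minimizing billed instance hours can be cast as a shortest-path problem. The sketch below is not the paper's TOID algorithm, whose graph construction is not given in the abstract. It assumes a single VM, a hypothetical sorted list of non-overlapping busy intervals measured in hours, and a billing rule that rounds each lease (one start-to-stop period) up to whole hours. The function name min_instance_hours is also an assumption made for this example.

```python
import heapq
import math


def min_instance_hours(intervals):
    """Return the fewest billed hours needed to cover all busy intervals.

    intervals: sorted, non-overlapping (start, end) pairs in hours.
    Node i means "intervals[0:i] are already covered". An edge i -> j
    stands for one lease that starts with interval i and stops after
    interval j - 1; its cost is the lease length rounded up to full hours
    (assumed billing rule). Dijkstra finds the cheapest path from 0 to n.
    """
    n = len(intervals)
    dist = [math.inf] * (n + 1)
    dist[0] = 0
    heap = [(0, 0)]  # (billed hours so far, node)
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale queue entry
        if i == n:
            return d
        lease_start = intervals[i][0]
        for j in range(i + 1, n + 1):
            lease_end = intervals[j - 1][1]
            cost = math.ceil(lease_end - lease_start)
            if d + cost < dist[j]:
                dist[j] = d + cost
                heapq.heappush(heap, (dist[j], j))
    return dist[n]


# Hypothetical schedule: two short tasks close together, one much later.
schedule = [(0.0, 0.5), (0.75, 1.0), (3.0, 3.5)]
print(min_instance_hours(schedule))
```

In this hypothetical schedule, keeping the VM running across the short first gap and shutting it down before the long second gap costs fewer billed hours than either stopping after every task or never stopping.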
Pages: 679-710
Number of pages: 31