A Data and Task Co-Scheduling Algorithm for Scientific Cloud Workflows

Cited by: 14
Authors
Deng, Kefeng [1]
Ren, Kaijun [1]
Zhu, Min [1]
Song, Junqiang [1]
Affiliations
[1] Natl Univ Def Technol, Sch Comp, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cloud computing; scientific workflow; co-scheduling; data placement; task scheduling; strategy
DOI
10.1109/TCC.2015.2511745
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Cloud computing has emerged as a promising computational infrastructure for cost-efficient workflow execution by provisioning on-demand resources in a pay-as-you-go manner. Because scientific workflows require access to community-wide resources, they usually need to be performed in collaborative cloud environments composed of multiple datacenters. Although such environments facilitate scientific collaboration, the movement of input and intermediate datasets across geographically distributed datacenters may cause intolerable latency that hinders the efficient execution of large-scale data-intensive scientific workflows. To address this problem, in this article we propose a novel multi-level K-cut graph partitioning algorithm that minimizes the volume of data transferred across datacenters while satisfying load balancing and fixed data constraints. The algorithm first contracts the fixed input datasets in the same datacenter and their consuming tasks, and then coarsens the contracted graph to a predefined scale in a level-by-level manner. Next, a K-cut algorithm partitions the resulting graph into K parts such that the cut size is minimized. Finally, the partitioned graph is projected back to the original workflow graph, during which the load balancing constraint is maintained. We evaluate our algorithm using three real-world workflow applications, and the results demonstrate that the proposed algorithm outperforms other state-of-the-art algorithms.
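The contract, partition, and project-back steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the multi-level coarsening is omitted, the K-cut heuristic is replaced by an exhaustive search that only works on tiny graphs, and load is assumed to be a simple count of nodes per datacenter. All names (`contract_fixed`, `best_partition`, the `DC#` label prefix, `max_load`) are hypothetical, and the sketch assumes each task consumes fixed data from at most one datacenter.

```python
from itertools import product

def contract_fixed(edges, fixed):
    """Map each fixed dataset and its neighbouring (consuming) tasks to one
    super-node per datacenter; every other node maps to itself."""
    group = {d: f"DC#{dc}" for d, dc in fixed.items()}
    for u, v in edges:
        for a, b in ((u, v), (v, u)):
            if a in fixed:
                # Assumption: the first datacenter seen wins on conflicts.
                group.setdefault(b, f"DC#{fixed[a]}")
    for u, v in edges:
        group.setdefault(u, u)
        group.setdefault(v, v)
    return group

def contracted_edges(edges, group):
    """Sum data volumes between distinct super-nodes; drop internal edges."""
    merged = {}
    for (u, v), w in edges.items():
        gu, gv = group[u], group[v]
        if gu != gv:
            key = tuple(sorted((gu, gv)))
            merged[key] = merged.get(key, 0) + w
    return merged

def cut_size(edges, part):
    """Total data volume on edges whose endpoints sit in different parts."""
    return sum(w for (u, v), w in edges.items() if part[u] != part[v])

def best_partition(edges, fixed, k, max_load):
    """Exhaustive stand-in for the K-cut step on the contracted graph."""
    group = contract_fixed(edges, fixed)
    small = contracted_edges(edges, group)
    weight = {}
    for g in group.values():
        weight[g] = weight.get(g, 0) + 1  # load = number of merged nodes
    # Fixed-data constraint: each datacenter super-node stays in its own part.
    pinned = {f"DC#{dc}": dc for dc in set(fixed.values())}
    free = [g for g in weight if g not in pinned]
    best = None
    for choice in product(range(k), repeat=len(free)):
        part = {**pinned, **dict(zip(free, choice))}
        loads = [0] * k
        for g, p in part.items():
            loads[p] += weight[g]
        if max(loads) > max_load:  # load balancing constraint
            continue
        cost = cut_size(small, part)
        if best is None or cost < best[0]:
            best = (cost, part)
    if best is None:
        raise ValueError("no partition satisfies the load constraint")
    # Project back: every original node inherits its super-node's part.
    return best[0], {node: best[1][g] for node, g in group.items()}

# Toy workflow: d1 is fixed in datacenter 0, d2 in datacenter 1.
edges = {("d1", "t1"): 10, ("d2", "t2"): 10, ("t1", "t3"): 5,
         ("t2", "t3"): 1, ("t3", "t4"): 2, ("t2", "t4"): 4}
cost, placement = best_partition(edges, {"d1": 0, "d2": 1}, k=2, max_load=3)
print(cost, placement)
```

In this toy case the contraction hides the two heaviest edges inside the datacenter super-nodes, so the search only has to place `t3` and `t4`. The exhaustive loop grows as K to the power of the number of free nodes, which is why a coarsening phase and a heuristic K-cut are needed for realistic workflows.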
Pages: 349-362
Number of pages: 14