Storage-aware Algorithms for Scheduling of Workflow Ensembles in Clouds

Cited by: 28
Authors
Bryk, Piotr [1]
Malawski, Maciej [2]
Juve, Gideon [3]
Deelman, Ewa [3]
Affiliations
[1] Google Poland, Warsaw Financial Ctr, Emilii Plater 53, Warsaw, Poland
[2] AGH Univ Sci & Technol, Dept Comp Sci, Al Mickiewicza 30, PL-30059 Krakow, Poland
[3] USC Informat Sci Inst, 4676 Admiralty Way, Marina Del Rey, CA USA
Funding
US National Science Foundation;
Keywords
Workflow ensembles; Scheduling algorithms; Cloud computing; Cloud storage; SCIENTIFIC WORKFLOWS; SYSTEMS;
DOI
10.1007/s10723-015-9355-6
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper focuses on data-intensive workflows and addresses the problem of scheduling workflow ensembles under cost and deadline constraints in Infrastructure as a Service (IaaS) clouds. Previous research in this area ignores file transfers between workflow tasks, which, as we show, often have a large impact on workflow ensemble execution. In this paper we propose and implement a simulation model for handling file transfers between tasks, featuring the ability to dynamically calculate bandwidth and supporting a configurable number of replicas, thus allowing us to simulate various levels of congestion. The resulting model is capable of representing a wide range of storage systems available on clouds: from in-memory caches (such as memcached), to distributed file systems (such as NFS servers) and cloud storage (such as Amazon S3 or Google Cloud Storage). We observe that file transfers may have a significant impact on ensemble execution; for some applications up to 90 % of the execution time is spent on file transfers. Next, we propose and evaluate a novel scheduling algorithm that minimizes the number of transfers by taking advantage of data caching and file locality. We find that for data-intensive applications it performs better than other scheduling algorithms. Additionally, we modify the original scheduling algorithms to effectively operate in environments where file transfers take non-zero time.
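The two ideas named in the abstract, congestion-dependent transfer bandwidth and locality-aware task placement, can be illustrated with a minimal sketch. This is not the authors' simulator or algorithm: the class names, the equal-sharing bandwidth formula, and the "fewest bytes to fetch" placement rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class StorageModel:
    """Hypothetical shared storage; the sharing formula is an assumption."""
    replica_bandwidth: float  # bytes/s that one replica can serve
    num_replicas: int
    active_transfers: int = 0

    def per_transfer_bandwidth(self) -> float:
        # Assumption: concurrent transfers split the aggregate replica
        # bandwidth equally, and one transfer never exceeds one replica.
        concurrent = max(1, self.active_transfers)
        share = min(1.0, self.num_replicas / concurrent)
        return self.replica_bandwidth * share

    def transfer_time(self, size_bytes: float) -> float:
        # Time to move one file at the current congestion level.
        return size_bytes / self.per_transfer_bandwidth()


@dataclass
class VM:
    """Hypothetical virtual machine with a local file cache."""
    name: str
    cache: dict = field(default_factory=dict)  # file name -> size in bytes


def bytes_to_fetch(vm: VM, inputs: dict) -> float:
    # Only input files missing from the VM's cache need a transfer.
    return sum(size for name, size in inputs.items() if name not in vm.cache)


def pick_vm(vms: list, inputs: dict) -> VM:
    # Locality-aware choice: prefer the VM that must fetch the fewest bytes.
    return min(vms, key=lambda vm: bytes_to_fetch(vm, inputs))


storage = StorageModel(replica_bandwidth=100.0, num_replicas=2, active_transfers=4)
vm_a = VM("a", cache={"x": 10.0})
vm_b = VM("b")
task_inputs = {"x": 10.0, "y": 5.0}
chosen = pick_vm([vm_a, vm_b], task_inputs)
fetch_time = storage.transfer_time(bytes_to_fetch(chosen, task_inputs))
```

In this toy setup, raising `num_replicas` or lowering `active_transfers` increases the per-transfer bandwidth, which is one simple way to vary the level of congestion.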
Pages: 359 - 378
Page count: 20
Related papers
50 in total
  • [1] Storage-aware Algorithms for Scheduling of Workflow Ensembles in Clouds
    Piotr Bryk
    Maciej Malawski
    Gideon Juve
    Ewa Deelman
    Journal of Grid Computing, 2016, 14 : 359 - 378
  • [2] A storage-aware scheduling scheme for VOD
    Peng, C
    Shen, H
    GRID AND COOPERATIVE COMPUTING GCC 2004, PROCEEDINGS, 2004, 3251 : 803 - 806
  • [3] Storage-aware task scheduling with reliable resource selection
    Niyoyita, Jean Paul
    Dong, Shoubin
    Journal of Computational Information Systems, 2015, 11 (01): 123 - 131
  • [4] A Sequential Cooperative Game Theoretic Approach to Storage-aware Scheduling of Multiple Large-scale Workflow Applications in Grids
    Duan, Rubing
    Prodan, Radu
    Li, Xiaorong
    2012 ACM/IEEE 13TH INTERNATIONAL CONFERENCE ON GRID COMPUTING (GRID), 2012, : 31 - 39
  • [5] Storage-aware Task Scheduling for Performance Optimization of Big Data Workflows
    Ye, Qianwen
    Wu, Chase Q.
    Cao, Huiyan
    Rao, Nageswara S. V.
    Hou, Aiqin
    2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 1095 - 1102
  • [6] Privacy-aware and cost-aware workflow scheduling in clouds
    Wen Y.
    Liu J.
    Chen C.
    Jisuanji Jicheng Zhizao Xitong/Computer Integrated Manufacturing Systems, CIMS, 2016, 22 (02): 294 - 301
  • [7] Storage-Aware Value Prediction
    Salehi, Mohammad
    Baniasadi, Amirali
    13TH EUROMICRO CONFERENCE ON DIGITAL SYSTEM DESIGN: ARCHITECTURES, METHODS AND TOOLS, 2010, : 722 - 728
  • [8] Storage-aware Joint User Scheduling and Spectrum Allocation for Federated Learning
    Shen, Yineng
    Yuan, Jiantao
    Chen, Xianfu
    Wu, Celimuge
    Yin, Rui
    2022 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM 2022), 2022, : 4716 - 4721
  • [9] Application and Storage-Aware Data Placement and Job Scheduling for Hadoop Clusters
    Li, Tao
    He, Shuibing
    Chen, Ping
    Yang, Siling
    Yin, Yanlong
    Xu, Cheng
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2020, 29 (16)
  • [10] Effective Algorithms for Scheduling Workflow Tasks on Mobile Clouds
    Li, Heng
    Zhu, Yaoqin
    Zhou, Meng
    Dong, Yun
    JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS, 2020, 29 (16)