A Data-aware Partitioning and Optimization Method for Large-scale Workflows in Hybrid Computing Environments

Cited by: 0
Authors
Duan, Rubing [1]
Li, Xiaorong [1]
Affiliations
[1] ASTAR, Inst High Performance Comp, Singapore, Singapore
DOI
10.1109/ICPADS.2013.29
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
While hybrid computing environments offer good potential for achieving high performance at low economic cost, they also introduce a broad set of unpredictable overheads, especially when running data-intensive applications. This paper describes a novel approach that refines workflow structures and optimizes intermediate data transfers for large-scale scientific workflows containing thousands (or even millions) of tasks. The proposed method comprises pre- and post-partitioning of workflows and data-flow optimization. First, it partitions a workflow by identifying the critical path of the task graph. Second, it controls the granularity of partitions to reduce the complexity of the task graph so that large-scale workflows can be processed. Third, it optimizes the data flow based on the scheduling to minimize communication overhead. The proposed approach can handle complex data flows and significantly reduces data transfer by replacing individual tasks according to data dependencies. Experiments with real applications such as Montage and Broadband demonstrate the effectiveness of the method in achieving low execution time with low communication overhead in hybrid computing environments.
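The abstract names critical-path identification as the first partitioning step but gives no algorithmic detail. As a rough illustration only, and not the authors' implementation, the sketch below finds the critical path of a small task graph as the longest path through a DAG, counting task runtimes plus data-transfer times on edges. The task names, runtimes, transfer costs, and the function name `critical_path` are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def critical_path(runtime, edges):
    """Return (path, length) of the longest path through a task DAG.

    runtime: {task: execution time}            (hypothetical values)
    edges:   {(src, dst): data-transfer time}  (hypothetical values)
    """
    # Predecessor sets, the input shape TopologicalSorter expects.
    preds = {task: set() for task in runtime}
    for src, dst in edges:
        preds[dst].add(src)

    finish = {}     # earliest finish time of each task
    best_pred = {}  # predecessor that determines each task's start time
    for task in TopologicalSorter(preds).static_order():
        start, chosen = 0.0, None
        for p in preds[task]:
            candidate = finish[p] + edges[(p, task)]
            if candidate > start:
                start, chosen = candidate, p
        finish[task] = start + runtime[task]
        best_pred[task] = chosen

    # Walk back from the task that finishes last.
    end = max(finish, key=finish.get)
    path, node = [], end
    while node is not None:
        path.append(node)
        node = best_pred[node]
    path.reverse()
    return path, finish[end]


# Assumed toy workflow: a fans out to b and c, which both feed d.
runtime = {"a": 2, "b": 3, "c": 1, "d": 2}
edges = {("a", "b"): 1, ("a", "c"): 4, ("b", "d"): 1, ("c", "d"): 1}
path, length = critical_path(runtime, edges)
print(path, length)
```

In this toy graph the heavy transfer on the edge from a to c places c, not the longer-running b, on the critical path, which is the kind of data-dependent effect a data-aware partitioner would need to account for.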
Pages: 126-133
Number of pages: 8