DEE: A distributed fault tolerant workflow enactment engine for Grid computing

Cited by: 0
Authors
Duan, RB [1]
Prodan, R [1]
Fahringer, T [1]
Affiliations
[1] Univ Innsbruck, Inst Comp Sci, A-6020 Innsbruck, Austria
Keywords
Grid computing; checkpointing; dependence analysis; distributed enactment engine; fault tolerance; overhead analysis
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
Designing and implementing a workflow management system that supports scalable execution of large-scale scientific workflows in dynamic and heterogeneous Grid environments is a complex task. In this paper we describe the Distributed workflow Enactment Engine (DEE) of the ASKALON Grid application development environment. DEE proposes a decentralized architecture that simplifies the management of large workflows and reduces its overhead through partitioning, improved data locality, and reduced workflow-level checkpointing overhead. We report experimental results for a real-world material science workflow application.
Pages: 704-716
Number of pages: 13