Persistent fault-tolerance for divide-and-conquer applications on the grid

Cited by: 0
Authors
Wrzesinska, Gosia [1 ]
Oprescu, Ana-Maria [1 ]
Kielmann, Thilo [1 ]
Bal, Henri [1 ]
Affiliations
[1] Vrije Univ Amsterdam, Amsterdam, Netherlands
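Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract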
Grid applications need to be fault tolerant, malleable, and migratable. In previous work, we have presented orphan saving, an efficient mechanism addressing these issues for divide-and-conquer applications. In this paper, we present a mechanism for writing partial results to checkpoint files, adding the capability to also tolerate the total loss of all processors, and to allow suspending and later resuming an application. Both mechanisms have only negligible overheads in the absence of faults, even with extremely short checkpointing intervals like one minute. In the case of faults, the new checkpointing mechanism outperforms orphan saving by 10% to 15%. Also, suspending/resuming an application has only little overhead, making our approach very attractive for writing grid applications.
Pages: 425 / +
Page count: 3