FINE: A Fully Informed aNd Efficient communication-induced checkpointing protocol for distributed systems

Cited by: 18
Authors
Luo, Yi [1]
Manivannan, D. [1]
Affiliation
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
Funding
U.S. National Science Foundation;
Keywords
Distributed systems; Communication-induced checkpointing protocols; Consistent global checkpoints; Rollback-recovery; Time
DOI
10.1016/j.jpdc.2008.07.012
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Communication-Induced Checkpointing (CIC) protocols are classified in the literature into two categories: index-based and model-based. In this paper, we discuss two data structures used in these two kinds of CIC protocols and their different roles in helping the checkpointing algorithms enforce the Z-cycle-free (ZCF) property. We then present our Fully Informed aNd Efficient (FINE) communication-induced checkpointing algorithm, which has not only less checkpointing overhead than the well-known Fully Informed (FI) CIC protocol proposed by Hélary et al. but also less message overhead. Performance evaluation indicates that our protocol performs better than many other existing CIC protocols. (C) 2008 Elsevier Inc. All rights reserved.
Pages: 153 - 167
Page count: 15
Related papers
50 in total
  • [31] Low overhead communication-induced checkpointing protocols ensuring rollback-dependency trackability property
    Abdelhafidi, Z.
    Lagraa, N.
    Yagoubi, M. B.
    Djoudi, M.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (21):
  • [32] Theoretical and experimental evaluation of communication-induced checkpointing protocols in FE and FLazy-E families
    Luo, Yi
    Manivannan, D.
    [J]. PERFORMANCE EVALUATION, 2011, 68 (05) : 429 - 445
  • [33] Impossibility of scalar clock-based communication-induced checkpointing protocols ensuring the RDT property
    Baldoni, R
    Hélary, JM
    Mostéfaoui, A
    Raynal, M
    [J]. INFORMATION PROCESSING LETTERS, 2001, 80 (02) : 105 - 111
  • [34] Soft-Checkpointing Based Hybrid Synchronous Checkpointing Protocol for Mobile Distributed Systems
    Kumar, Parveen
    Garg, Rachit
    [J]. INTERNATIONAL JOURNAL OF DISTRIBUTED SYSTEMS AND TECHNOLOGIES, 2011, 2 (01) : 1 - 13
  • [35] Efficient recovery approach in distributed systems with hybrid checkpointing
    Jiang, YX
    Gupta, B
    [J]. COMPUTERS AND THEIR APPLICATIONS, 2000, : 292 - 297
  • [36] Efficient techniques for adaptive independent checkpointing in distributed systems
    Lin, CM
    Dow, CR
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2000, E83D (08): : 1642 - 1653
  • [37] An efficient and scalable checkpointing and recovery algorithm for distributed systems
    Kumar, K. P. Krishna
    Hansdah, R. C.
    [J]. DISTRIBUTED COMPUTING AND NETWORKING, PROCEEDINGS, 2006, 4308 : 94 - 99
  • [38] EFFICIENT DECENTRALIZED CHECKPOINTING IN DISTRIBUTED DATABASE-SYSTEMS
    SON, SH
    [J]. PROCEEDINGS OF THE TWENTY-FIRST, ANNUAL HAWAII INTERNATIONAL CONFERENCE ON SYSTEM SCIENCES, VOLS 1-4: ARCHITECTURE TRACK, SOFTWARE TRACK, DECISION SUPPORT AND KNOWLEDGE BASED SYSTEMS TRACK, APPLICATIONS TRACK, 1988, : B554 - B560
  • [39] A causal message logging protocol with asynchronous checkpointing for distributed systems
    Ahn, J
    Kim, K
    Hwang, C
    [J]. PARALLEL AND DISTRIBUTED COMPUTING SYSTEMS, 2000, : 523 - 528
  • [40] Design and analysis of an efficient algorithm for coordinated checkpointing in distributed systems
    Cao, JN
    Jia, WJ
    Jia, XH
    Cheung, TY
    [J]. ADVANCES IN PARALLEL AND DISTRIBUTED COMPUTING - PROCEEDINGS, 1997, : 261 - 268