FINE: A Fully Informed aNd Efficient communication-induced checkpointing protocol for distributed systems

Cited by: 18
Authors
Luo, Yi [1]
Manivannan, D. [1]
Affiliations
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
Funding
National Science Foundation (USA)
Keywords
Distributed systems; Communication-induced checkpointing protocols; Consistent global checkpoints; Rollback-recovery; Time
DOI
10.1016/j.jpdc.2008.07.012
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Communication-Induced Checkpointing (CIC) protocols are classified into two categories in the literature: index-based and model-based. In this paper, we discuss two data structures used in these two kinds of CIC protocols and their different roles in helping the checkpointing algorithms enforce the Z-cycle Free (ZCF) property. We then present our Fully Informed aNd Efficient (FINE) communication-induced checkpointing algorithm, which has both less checkpointing overhead and less message overhead than the well-known Fully Informed (FI) CIC protocol proposed by Helary et al. Performance evaluation indicates that our protocol performs better than many other existing CIC protocols. © 2008 Elsevier Inc. All rights reserved.
Pages: 153-167
Page count: 15