EFFICIENT CHECKPOINTING PROCEDURES FOR FAULT-TOLERANT DISTRIBUTED SYSTEMS

Cited by: 0
Authors
SALEH, K
AGARWAL, A
Affiliations
[1] KUWAIT UNIV,DEPT ELECT & COMP ENGN,SAFAT 13060,KUWAIT
[2] CONCORDIA UNIV,DEPT ELECT & COMP ENGN,MONTREAL H3G 1M8,QUEBEC,CANADA
[3] UNIV ROORKEE,DEPT ELECTR & COMP ENGN,ROORKEE 247667,UTTAR PRADESH,INDIA
Source
MICROPROCESSING AND MICROPROGRAMMING | 1994 / Vol. 40 / Issue 06
Keywords
CHECKPOINTING; DISTRIBUTED SYSTEMS; FAULT TOLERANCE; ROLLBACK RECOVERY; SYSTEM STATE
DOI
10.1016/0165-6074(94)90107-4
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
A classical approach for achieving fault tolerance in distributed systems is based on the incorporation of efficient and fault-tolerant procedures for checkpointing and recovery in such systems. We propose two checkpointing procedures, which can be initiated by any process in the system or upon failure of one or more component processes. Our procedures return the most recent and consistent checkpoints for the processes initiating the procedure, and do not interfere with the progress of the distributed system application. Furthermore, our procedures guarantee that a consistent checkpoint will be obtained when they terminate. Examples illustrating the application of the procedures are also provided.
Pages: 427-438
Page count: 12
Related papers
50 items in total
  • [21] A Novel Fault-Tolerant Scheme for Distributed Systems
    Zhang, Xiaoqin
    Wei, Zhidong
    Zhang, Fenggui
    Liu, Guoliang
    CEIS 2011, 2011, 15
  • [22] Distributed Voting for Fault-Tolerant Nanoscale Systems
    Namazi, Ali
    Nourani, Mehrdad
    2007 IEEE INTERNATIONAL CONFERENCE ON COMPUTER DESIGN, VOLS 1 AND 2, 2007, : 569 - 574
  • [23] Secure and fault-tolerant voting in distributed systems
    Hardekopf, B
    Kwiat, K
    Upadhyaya, S
    2001 IEEE AEROSPACE CONFERENCE PROCEEDINGS, VOLS 1-7, 2001, : 1117 - 1126
  • [25] DETECTING UNREALIZABILITY OF DISTRIBUTED FAULT-TOLERANT SYSTEMS
    Finkbeiner, Bernd
    Tentrup, Leander
    LOGICAL METHODS IN COMPUTER SCIENCE, 2015, 11 (03)
  • [27] FAULT-TOLERANT LOOPS FOR DISTRIBUTED MEASUREMENT SYSTEMS
    GATER, C
    MACKIE, RDL
    JORDAN, JR
    IEE PROCEEDINGS-E COMPUTERS AND DIGITAL TECHNIQUES, 1989, 136 (06): : 485 - 489
  • [28] Evaluation of fault-tolerant distributed web systems
    Hong, YS
    No, JH
    Han, I
    WORDS 2005: 10th IEEE International Workshop on Object-Oriented Real-Time Dependable, Proceedings, 2005, : 148 - 151
  • [29] DISTRIBUTED FAULT-TOLERANT COMPUTER-SYSTEMS
    RENNELS, DA
    COMPUTER, 1980, 13 (03) : 55 - 65
  • [30] COMMUNICATION STRUCTURES IN FAULT-TOLERANT DISTRIBUTED SYSTEMS
    PRADHAN, DK
    MEYER, FJ
    NETWORKS, 1993, 23 (04) : 379 - 389