A formal model for fault-tolerance in distributed systems

Cited by: 0
Authors
Hamid, B [1]
Mosbah, M [1]
Affiliation
[1] Univ Bordeaux 1, LaBRI, ENSEIRB, F-33405 Talence, France
Keywords
distributed systems; fault-tolerance; graph rewriting systems; local computations
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We present a formal method, based on graph rewriting systems, for specifying fault-tolerant distributed algorithms and proving them correct. The method deals with crash failures: in a crash-failure system, a process can fail by crashing, i.e., by permanently halting. The faulty processes are those contaminated by the crashes. The methodology is formalized in two phases. In the first phase, we build the set of illegitimate configurations to specify the faults and the faulty processes. In the second phase, we add correction rules to the initial graph rewriting system that encodes the distributed algorithm. These rules detect and eliminate the faults locally during the computation. The method can be implemented on an asynchronous message-passing system that reports faults. To illustrate the approach, we present examples of fault-tolerant distributed spanning tree algorithms.
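The two-phase idea in the abstract can be illustrated with a minimal sketch. It is not the authors' rule system: the labels `"A"` (in the tree) and `"N"` (neutral), the function names `is_legitimate` and `stabilize`, and the example graph are all hypothetical, and the sketch assumes a centralized simulation in which the root never crashes, crashes are reported by removing a node from `alive`, and the correction rule always takes priority over the construction rule.

```python
def is_legitimate(v, root, label, parent, alive):
    """A tree node is legitimate if it is the root, or if its parent
    is alive and still labeled as part of the tree."""
    if v == root:
        return True
    p = parent[v]
    return p in alive and label[p] == "A"


def stabilize(graph, root, label, parent, alive):
    """Apply the rewriting rules until none is applicable.
    Returns the parent map of the live nodes that are in the tree."""
    while True:
        # Correction rule (assumed to have priority): a tree node whose
        # parent crashed or left the tree resets itself to neutral.
        orphan = next((v for v in sorted(alive)
                       if label[v] == "A"
                       and not is_legitimate(v, root, label, parent, alive)),
                      None)
        if orphan is not None:
            label[orphan], parent[orphan] = "N", None
            continue
        # Construction rule: a tree node adopts a live neutral neighbor.
        step = next(((u, v) for u in sorted(alive) if label[u] == "A"
                     for v in sorted(graph[u])
                     if v in alive and label[v] == "N"),
                    None)
        if step is None:
            return {v: parent[v] for v in sorted(alive) if label[v] == "A"}
        u, v = step
        label[v], parent[v] = "A", u


# Hypothetical example network given as adjacency lists.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
root = 0
label = {v: "N" for v in graph}
label[root] = "A"
parent = {v: None for v in graph}
alive = set(graph)

tree_before = stabilize(graph, root, label, parent, alive)
alive.discard(1)  # node 1 crashes; the sketch assumes the crash is reported
tree_after = stabilize(graph, root, label, parent, alive)
```

In this run, nodes 3 and 4 lose their path to the root when node 1 crashes, reset themselves through the correction rule, and are re-attached through node 2. Giving the correction rule priority is a simplification that keeps the sketch from forming cycles; a fully local rule set would need additional care.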
Pages: 108-121
Page count: 14