SYNTHESIS OF ALGORITHM-BASED FAULT-TOLERANT SYSTEMS FOR DEPENDENCE GRAPHS

Cited by: 11
Authors
VINNAKOTA, B [1]
JHA, NK [1]
Affiliation
[1] PRINCETON UNIV, DEPT ELECT ENGN, PRINCETON, NJ 08544
Keywords
ALGORITHM-BASED FAULT TOLERANCE; CHECKSUM ENCODING; CONCURRENT ERROR DETECTION; DEPENDENCE GRAPHS; FAULT DETECTABILITY; FAULT LOCATABILITY; SYSTEM SYNTHESIS FOR FAULT TOLERANCE
DOI
10.1109/71.238622
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Algorithm-Based Fault Tolerance (ABFT) is a scheme to improve the reliability of parallel architectures used for computation-intensive tasks. The exact implementation of an ABFT scheme is algorithm-dependent. ABFT systems have very low overhead compared to other fault tolerance schemes with similar benefits. Few results are available in the area of general synthesis of ABFT systems. A two-stage approach to the synthesis of ABFT systems is proposed. In the first stage a system-level code is chosen to encode the data used in the algorithm. In the second stage the optimal architecture to implement the scheme is chosen using dependence graphs. Dependence graphs are a graph-theoretic form of algorithm representation. We demonstrate that not all architectures are ideal for the implementation of a particular ABFT scheme. We propose new measures to characterize the fault tolerance capability of a system to better exploit the proposed synthesis method. Dependence graphs can also be used for the synthesis of ABFT schemes for non-linear problems. An example of a fault-tolerant median filter is provided to illustrate their utility for such problems.
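The abstract does not say which system-level code the authors choose, but the keywords mention checksum encoding. As a rough illustration of what encoding the data used in the algorithm can look like, the sketch below uses the classical row/column checksum scheme for matrix multiplication. It is an assumption made for illustration, not necessarily the scheme synthesized in this paper, and all function names are hypothetical. It uses integer data so the checks can be exact; floating-point data would need a tolerance.

```python
def with_column_checksum(a):
    """Append a row holding the sum of each column of a."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append to every row of b the sum of that row."""
    return [row + [sum(row)] for row in b]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def locate_error(c_full):
    """Check a full-checksum product matrix.

    Returns None if every row and column checksum holds, or the
    (row, col) index of a single corrupted data element.
    """
    n = len(c_full) - 1      # number of data rows
    m = len(c_full[0]) - 1   # number of data columns
    bad_rows = [i for i in range(n) if sum(c_full[i][:m]) != c_full[i][m]]
    bad_cols = [j for j in range(m)
                if sum(c_full[i][j] for i in range(n)) != c_full[n][j]]
    if not bad_rows and not bad_cols:
        return None
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return (bad_rows[0], bad_cols[0])
    raise ValueError("error pattern not locatable with single checksums")


# Encode the inputs, multiply, then check the result.
a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_full = matmul(with_column_checksum(a), with_row_checksum(b))
clean_result = locate_error(c_full)

# Simulate a faulty processor corrupting one output element.
c_full[0][1] += 5
faulty_result = locate_error(c_full)
```

In this sketch a single corrupted data element violates exactly one row check and one column check, which is what allows it to be located as well as detected. How such checks are mapped onto processors is the architectural question the abstract addresses with dependence graphs; the sketch does not model that stage.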
Pages: 864-874
Number of pages: 11