Coding techniques for fault-tolerant parallel prefix computations in Abelian groups

Cited by: 1
Authors
Hadjicostis, CN
Affiliations
[1] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
[2] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
Source
COMPUTER JOURNAL | 2004 / Vol. 47 / No. 03
DOI
10.1093/comjnl/47.3.329
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents coding techniques that can be used to provide fault tolerance to a parallel prefix computation that is performed on a binary tree of processing nodes. More specifically, we discuss how a parallel prefix computation in an arbitrary Abelian group can be protected using group homomorphisms. The proposed approach is general enough to handle a variety of group operations of interest and allows for designs ranging from simple parity schemes to full replication. Error detecting and correcting mechanisms are used solely at the leaf nodes and can capture faults at any node or link within the binary tree architecture on which the parallel prefix computation is performed. Furthermore, by tracking the propagation of errors in the binary tree, our method can identify a processing node that has permanently failed based on information from simple error detecting mechanisms at the leaf nodes.
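As a rough illustration of the idea in the abstract, the sketch below protects a prefix-sum computation in the Abelian group (Z, +) with the homomorphism x -> x mod m and compares the two results at each output ("leaf"). It is not the paper's construction: the choice of group and homomorphism, the names `prefix_scan` and `protected_prefix`, and the `fault_at` knob are all assumptions made for this example, and the binary tree is replaced by a sequential scan for brevity.

```python
from itertools import accumulate


def prefix_scan(values, op):
    """Inclusive prefix computation; a sequential stand-in for the binary tree."""
    return list(accumulate(values, op))


def protected_prefix(values, modulus=3, fault_at=None):
    """Prefix sums in (Z, +) guarded by the homomorphism x -> x mod modulus.

    fault_at is a hypothetical knob that corrupts one output of the main
    computation. Returns (prefixes, indices of leaves whose check failed).
    """
    def phi(x):
        return x % modulus

    # Main computation in the original group.
    main = prefix_scan(values, lambda a, b: a + b)
    if fault_at is not None:
        main[fault_at] += 1  # injected error; +1 is not a multiple of modulus

    # Redundant computation on the homomorphic images, in Z_modulus.
    check = prefix_scan([phi(v) for v in values],
                        lambda a, b: (a + b) % modulus)

    # Leaf-level check: phi(prefix) must equal the prefix of the images.
    flagged = [i for i, (p, c) in enumerate(zip(main, check)) if phi(p) != c]
    return main, flagged
```

Because the map is a homomorphism, a fault-free run satisfies phi(prefix_i) = check_i at every leaf. An error whose value is a multiple of the modulus would go unnoticed by this simple parity-style check, which is one reason richer codes, up to full replication, may be preferred.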
Pages: 329-341
Page count: 13
Related papers
50 in total
  • [1] A Markov model for fault-tolerant task parallel computations
    Bertolli, Carlo
    Meneghin, Massimiliano
    Gabarro, Joaquim
    [J]. FROM GRIDS TO SERVICE AND PERVASIVE COMPUTING, 2008, : 123 - +
  • [2] Efficient Coding Schemes for Fault-Tolerant Parallel Filters
    Gao, Zhen
    Reviriego, Pedro
    Xu, Zhan
    Su, Xin
    Wang, Jing
    Antonio Maestro, Juan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2015, 62 (07) : 666 - 670
  • [3] ON FAULT-TOLERANT SYMBOLIC COMPUTATIONS
    DELYON, B
    MALER, O
    [J]. LECTURE NOTES IN COMPUTER SCIENCE, 1991, 571 : 259 - 269
  • [4] SCHEDULING SAVES IN FAULT-TOLERANT COMPUTATIONS
    COFFMAN, EG
    FLATTO, L
    KREININ, AY
    [J]. ACTA INFORMATICA, 1993, 30 (05) : 409 - 423
  • [5] FAULT-TOLERANT PARALLEL PROCESSOR
    HARPER, RE
    LALA, JH
    [J]. JOURNAL OF GUIDANCE CONTROL AND DYNAMICS, 1991, 14 (03) : 554 - 563
  • [6] A Fault-Tolerant Handshake Algorithm for Local Computations
    Fontaine, Allyx
    Mosbah, Mohamed
    Tounsi, Mohamed
    Zemmari, Akka
    [J]. IEEE 30TH INTERNATIONAL CONFERENCE ON ADVANCED INFORMATION NETWORKING AND APPLICATIONS WORKSHOPS (WAINA 2016), 2016, : 475 - 480
  • [7] Fault-tolerant techniques for nanocomputers
    Nikolic, K
    Sadek, A
    Forshaw, M
    [J]. NANOTECHNOLOGY, 2002, 13 (03) : 357 - 362
  • [8] Fault-Tolerant Coding for Quantum Communication
    Christandl, Matthias
    Mueller-Hermes, Alexander
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 2024, 70 (01) : 282 - 317
  • [9] Fault-Tolerant Computation Meets Network Coding: Optimal Scheduling in Parallel Computing
    Li, Congduan
    Zhang, Yiqian
    Tan, Chee Wei
    [J]. IEEE TRANSACTIONS ON COMMUNICATIONS, 2023, 71 (07) : 3847 - 3860
  • [10] Fault-Tolerant Computation Meets Network Coding: Optimal Scheduling in Parallel Computing
    Li, Congduan
    Tan, Chee Wei
    Li, Jingting
    Chen, Siya
    [J]. 2021 IEEE GLOBAL COMMUNICATIONS CONFERENCE (GLOBECOM), 2021,