Fault Detection for Message Passing Systems

Cited by: 2
Authors
Karaata, Mehmet Hakan [1]
Hamdan, Ali [1]
Faisal, Maha H. [1]
AlShawan, Feda A. [2]
Affiliations
[1] Kuwait Univ, Comp Engn Dept, POB 5969, Safat 13060, Kuwait
[2] Publ Author Appl Educ & Training, Comp Sect, Elect Engn Dept, Kuwait, Kuwait
Keywords
Byzantine faults; distributed systems; fault detection; network protocols; node-disjoint paths; NETWORK; SECRECY
DOI
10.1142/S0218126618500706
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Many crucial dependable and secure services, including atomic commitment, consensus, group membership, and middleware services (such as replica, communication, and transaction services), rely on fault detectors. With a fault detector in place, the overlying service can be exempted from failure treatment and synchronization requirements. Fault detection is also essential for proving that the services carried out are correct. In this paper, we first identify the necessary conditions for detecting faults in a message passing system where multiple disjoint paths exist between each pair of endpoints. We then present the first fault detection protocol that uses disjoint paths to detect message metadata modification in the presence of various message interferences, along with other faults including omission faults, message replay, and spurious messages, where the paths with faults are not known a priori. In addition, the protocol authenticates message origins, which allows Sybil attacks to be detected, identifies faulty paths, and classifies faults in the presence of multiple messages sent by various system processes. We establish the completeness and soundness of the proposed algorithm: it detects each considered fault, and each detected fault is an actual fault. We also show that the algorithm does not incur significant packet size or delay overhead. The algorithm demonstrates the viability of disjoint paths for fault detection.
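The general idea of detecting and classifying faults over disjoint paths can be illustrated with a minimal sketch. This is not the protocol from the paper, whose details are not given in the abstract. It assumes a simplified model in which the sender transmits one copy of a message over each of several node-disjoint paths, fewer than half of the paths are faulty, and the receiver knows the set of path identifiers. The function name `classify_paths`, the path labels, and the fault labels are all hypothetical.

```python
from collections import Counter, defaultdict


def classify_paths(path_ids, copies):
    """Classify each path for one message (illustrative only, not the paper's protocol).

    path_ids: identifiers of the disjoint paths the sender used (assumed known).
    copies:   list of (path_id, payload) pairs observed by the receiver.
    Returns (majority_payload_or_None, {path_id: label}).
    """
    by_path = defaultdict(list)
    for path_id, payload in copies:
        by_path[path_id].append(payload)

    # Majority vote over the first copy seen on each known path.
    votes = Counter(by_path[p][0] for p in path_ids if p in by_path)
    majority = None
    if votes:
        payload, count = votes.most_common(1)[0]
        # Assumption: a strict majority of paths is fault-free.
        if count > len(path_ids) // 2:
            majority = payload

    report = {}
    for p in path_ids:
        got = by_path.get(p, [])
        if not got:
            report[p] = "omission"      # nothing arrived on this path
        elif len(got) > 1:
            report[p] = "replay"        # more than one copy arrived
        elif majority is not None and got[0] != majority:
            report[p] = "modification"  # copy disagrees with the majority
        else:
            report[p] = "ok"

    # Copies claiming a path the sender never used are treated as spurious.
    for p in by_path:
        if p not in path_ids:
            report[p] = "spurious"
    return majority, report


paths = ["p1", "p2", "p3", "p4", "p5"]
observed = [
    ("p1", b"commit"),
    ("p2", b"commit"),
    ("p3", b"abort"),    # altered in transit
    ("p5", b"commit"),
    ("p5", b"commit"),   # duplicate copy
    ("p9", b"commit"),   # unknown path
]
majority, report = classify_paths(paths, observed)
```

In this simplified model the receiver needs no prior knowledge of which paths are faulty: disagreement among the copies is what exposes them. Origin authentication and metadata checks, which the abstract also mentions, are outside the scope of this sketch.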
Pages: 26