Distributed Diagnosis of Dynamic Events in Partitionable Arbitrary Topology Networks

Cited by: 11
Authors
Duarte, Elias P., Jr. [1 ]
Weber, Andrea [1 ]
Ono Fonseca, Keiko V. [2 ]
Affiliations
[1] Univ Fed Parana, Dept Informat, BR-81531980 Curitiba Pr, Brazil
[2] Fed Univ Technol, Dept Informat, BR-80230901 Curitiba Pr, Brazil
Keywords
Network reachability; distributed diagnosis; multiprocessor systems; dynamic fault diagnosis; bounded correctness; unreliable failure detectors; group communication; diagnosability; implementation
DOI
10.1109/TPDS.2011.284
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This work introduces the Distributed Network Reachability (DNR) algorithm, a distributed system-level diagnosis algorithm that allows every node of a partitionable arbitrary topology network to determine which portions of the network are reachable and which are unreachable. DNR is the first distributed diagnosis algorithm that works in the presence of network partitions and healings caused by dynamic fault and repair events. Both crash and timing faults are assumed, and a faulty node is indistinguishable from a network partition. Every link is tested alternately by one of its adjacent nodes at successive testing intervals. Upon the detection of a new event, the new diagnostic information is disseminated to reachable nodes. New events can occur before the dissemination completes. Whenever a working node detects a new event or is informed of one, it may compute the network reachability using local diagnostic information. The bounded correctness of DNR is proved, including bounded diagnostic latency, bounded start-up, and accuracy. Simulation results are presented for several random and regular topologies, showing the performance of the algorithm under highly dynamic fault situations.
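The reachability step mentioned in the abstract can be illustrated with a minimal sketch. This is not the DNR algorithm itself: it only assumes that a node holds a local list of links and a set of links its diagnostic information currently marks as unresponsive, and it uses a plain breadth-first search to split the network into reachable and unreachable parts. The names `reachable_nodes`, `links`, and `unresponsive` are hypothetical and do not come from the paper.

```python
from collections import deque


def reachable_nodes(source, links, unresponsive_links):
    """Return the nodes reachable from `source` over links not marked unresponsive.

    Illustrative assumption: `links` is an iterable of (a, b) pairs and
    `unresponsive_links` is a set of frozenset({a, b}) entries taken from
    the node's local diagnostic information.
    """
    # Build an undirected adjacency map from the links believed to be working.
    adjacency = {}
    for a, b in links:
        if frozenset((a, b)) in unresponsive_links:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Standard breadth-first search from the computing node.
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


# Hypothetical five-node topology: a ring 0-1-2-3 with node 4 attached to node 3.
links = [(0, 1), (1, 2), (2, 3), (3, 0), (3, 4)]
unresponsive = {frozenset((3, 4))}  # the only link to node 4 failed its last test

reachable = reachable_nodes(0, links, unresponsive)
unreachable = {n for link in links for n in link} - reachable
```

In this sketch node 0 cannot tell whether node 4 crashed or was cut off by the failed link, which matches the abstract's remark that a faulty node is indistinguishable from a network partition.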
Pages: 1415-1426
Number of pages: 12
Related papers
50 in total
  • [1] Distributed Synchronization of Heterogeneous Oscillators on Networks With Arbitrary Topology
    Mallada, Enrique
    Freeman, Randy A.
    Tang, Ao Kevin
    [J]. IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2016, 3 (01): 12-23
  • [2] Distributed topology control of dynamic networks
    Zavlanos, Michael M.
    Tahbaz-Salehi, Alireza
    Jadbabaie, Ali
    Pappas, George J.
    [J]. 2008 AMERICAN CONTROL CONFERENCE, VOLS 1-12, 2008: 2660-2665
  • [3] RETRACTED: A dynamic distributed diagnosis algorithm for an arbitrary network topology with unreliable nodes and links (Retracted Article)
    Khilar, Pabitra Mohan
    Mahapatra, S.
    [J]. ADCOM 2007: PROCEEDINGS OF THE 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING AND COMMUNICATIONS, 2007, : 125 - +
  • [4] Formalising reconciliation in partitionable networks with distributed services
    Asplund, Mikael
    Nadjm-Tehrani, Simin
    [J]. RIGOROUS DEVELOPMENT OF COMPLEX FAULT-TOLERANT SYSTEMS, 2006, 4157 : 37 - +
  • [5] A Distributed Framework for Task Offloading in Edge Computing Networks of Arbitrary Topology
    Liu, Boxi
    Cao, Yang
    Zhang, Yue
    Jiang, Tao
    [J]. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, 2020, 19 (04): 2855-2867
  • [6] Distributed Constrained Minimum-Time Schedules in Networks of Arbitrary Topology
    Jackson, Justin
    Faied, Mariam
    Kabamba, Pierre
    Girard, Anouck
    [J]. IEEE TRANSACTIONS ON ROBOTICS, 2013, 29 (02): 554-563
  • [7] Distributed Topology Identification for Point Process Dynamic Networks
    Pasha, Syed Ahmed
    Solo, Victor
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015: 3681-3685
  • [8] Distributed diagnosis in dynamic fault environments for arbitrary network topologies
    Khilar, PM
    Mahapatra, S
    [J]. INDICON 2005 PROCEEDINGS, 2005: 56-59
  • [9] Distributed Topology Identification for Sparse Point Process Dynamic Networks
    Pasha, Syed Ahmed
    Solo, Victor
    [J]. 2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015: 3379-3384
  • [10] An algorithm for distributed hierarchical diagnosis of dynamic fault and repair events
    Duarte, EP
    Brawerman, A
    Albini, LCP
    [J]. SEVENTH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 2000: 299-306