On the reliability of large-scale distributed systems - A topological view

Cited by: 9
Authors
He, Yuan [1]
Ren, Hao [2]
Liu, Yunhao [1]
Yang, Baijian [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Hong Kong, Hong Kong, Peoples R China
[2] Natl Univ Def Technol, Coll Comp, Changsha 410073, Hunan, Peoples R China
[3] Ball State Univ, Dept Technol, Muncie, IN 47306 USA
Keywords
Cut vertex; Peer-to-peer; Reliability; Detection; Distributed method;
DOI
10.1016/j.comnet.2009.03.012
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
In large-scale, self-organized distributed systems, such as peer-to-peer (P2P) overlays and wireless sensor networks (WSN), a small proportion of the nodes are likely to be more critical to the system's reliability than others. This paper focuses on detecting cut vertices (nodes whose removal disconnects the network) so that these critical nodes can be either neutralized or protected. Detecting cut vertices is trivial when global knowledge of the whole system is available, but very challenging when it is not. In this paper, we propose a completely distributed scheme in which every single node can determine whether or not it is a cut vertex. In addition, our design confines the detection overhead to a constant rather than letting it grow in proportion to the size of the network. The correctness of the algorithm is theoretically proved, and the key performance gains are measured and verified through trace-driven simulations. (C) 2009 Elsevier B.V. All rights reserved.
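For orientation, the "trivial" case mentioned in the abstract, where the whole topology is known, can be sketched with the classic depth-first-search approach to articulation points. This is not the distributed scheme the paper proposes; it is a minimal centralized baseline, and the function name `cut_vertices` and the sample edge list are assumptions made purely for illustration.

```python
from collections import defaultdict


def cut_vertices(edges):
    """Return the set of cut vertices of an undirected graph.

    Centralized baseline (needs the full edge list); hypothetical helper,
    not the distributed algorithm described in the abstract.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    disc = {}    # DFS discovery time of each visited node
    low = {}     # earliest discovery time reachable from the node's subtree
    cuts = set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for v in adj[u]:
            if v == parent:
                continue
            if v in disc:
                # Back edge: u can reach an earlier-discovered node.
                low[u] = min(low[u], disc[v])
            else:
                children += 1
                dfs(v, u)
                low[u] = min(low[u], low[v])
                # v's subtree cannot bypass u, so removing u isolates it.
                if parent is not None and low[v] >= disc[u]:
                    cuts.add(u)
        # A DFS root is a cut vertex only if it has two or more DFS children.
        if parent is None and children > 1:
            cuts.add(u)

    for node in list(adj):
        if node not in disc:
            dfs(node, None)
    return cuts


# Illustrative topology: a triangle a-b-c with a tail c-d-e.
sample_edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d"), ("d", "e")]
critical = cut_vertices(sample_edges)
```

In the sample topology, removing `c` separates the triangle from the tail and removing `d` isolates `e`, so both are critical. The recursive DFS is only suitable for small graphs; an overlay with many thousands of nodes would need an iterative traversal, and in the setting the abstract targets no single node holds the full edge list at all.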
Pages: 2140-2152
Number of pages: 13