Customizable fault tolerance for wide-area replication

Cited by: 15
Authors
Amir, Yair [1]
Coan, Brian [2]
Kirsch, Jonathan [1]
Lane, John [1]
Affiliations
[1] Johns Hopkins Univ, Baltimore, MD 21218 USA
[2] Telcordia Technol, Piscataway, NJ USA
Source
SRDS 2007: 26TH IEEE INTERNATIONAL SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS | 2007
Funding
National Science Foundation (USA);
Keywords
DOI
10.1109/SRDS.2007.29
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Constructing logical machines out of collections of physical machines is a well-known technique for improving the robustness and fault tolerance of distributed systems. We present a new, scalable replication architecture, built upon logical machines specifically designed to perform well in wide-area systems spanning multiple sites. The physical machines in each site implement a logical machine by running a local state machine replication protocol, and a wide-area replication protocol runs among the logical machines. Implementing logical machines via the state machine approach affords free substitution of the fault tolerance method used in each site and in the wide-area replication protocol, allowing one to balance performance and fault tolerance based on perceived risk. We present a new Byzantine fault-tolerant protocol that establishes a reliable virtual communication link between logical machines. Our communication protocol is efficient (a necessity in wide-area environments), avoiding the need for redundant message sending during normal-case operation and allowing a logical machine to consume approximately the same wide-area bandwidth as a single physical machine. This dramatically improves the wide-area performance of our system compared to existing logical machine based approaches. We implemented a prototype system and compare its performance and fault tolerance to existing solutions.
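The abstract's point about substituting the fault tolerance method at each tier can be illustrated with a small sizing sketch. This is not the authors' implementation: the names `min_replicas`, `TierConfig`, and `total_physical_machines` are hypothetical, and the replica counts are the standard lower bounds for state machine replication (2f + 1 for benign faults, 3f + 1 for Byzantine faults), which are assumed here rather than taken from the abstract.

```python
from dataclasses import dataclass

# Standard minimum group sizes for tolerating f faulty members.
# Assumption: these textbook bounds apply to both tiers; the abstract
# does not state the exact group sizes used in the paper.
_MIN_GROUP_SIZE = {
    "benign": lambda f: 2 * f + 1,     # crash-tolerant replication
    "byzantine": lambda f: 3 * f + 1,  # Byzantine fault-tolerant replication
}


def min_replicas(fault_model: str, f: int) -> int:
    """Return the minimum group size that tolerates f faults."""
    if f < 0:
        raise ValueError("f must be non-negative")
    if fault_model not in _MIN_GROUP_SIZE:
        raise ValueError(f"unknown fault model: {fault_model!r}")
    return _MIN_GROUP_SIZE[fault_model](f)


@dataclass(frozen=True)
class TierConfig:
    """Fault model and fault threshold chosen for one tier (hypothetical)."""
    fault_model: str
    faults_tolerated: int

    @property
    def size(self) -> int:
        return min_replicas(self.fault_model, self.faults_tolerated)


def total_physical_machines(site: TierConfig, wide_area: TierConfig) -> int:
    """Each wide-area participant is a logical machine, i.e. one whole site."""
    return wide_area.size * site.size


if __name__ == "__main__":
    # Byzantine protection inside each site, benign protocol across sites.
    site = TierConfig("byzantine", faults_tolerated=1)
    wide_area = TierConfig("benign", faults_tolerated=2)
    print(site.size, wide_area.size, total_physical_machines(site, wide_area))
```

Swapping either `TierConfig` for a different fault model changes the machine count without touching the other tier, which is the kind of performance versus fault tolerance trade-off the abstract describes.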
Pages: 66 / +
Number of pages: 3
Related papers
50 in total
  • [1] Exploiting data-flow for fault-tolerance in a wide-area parallel system
    NguyenTuong, A
    Grimshaw, AS
    Hyett, M
    15TH SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS, 1996, : 2 - 11
  • [2] Fault tolerant wide-area parallel computing
    Weissman, JB
    PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 2000, 1800 : 1214 - 1225
  • [3] Wide-area replication support for global data repositories
    Decker, H
    Irún-Briz, L
    de Juan-Marín, R
    Armendáriz, JE
    Muñoz-Escoí, F
    Sixteenth International Workshop on Database and Expert Systems Applications, Proceedings, 2005, : 1117 - 1121
  • [4] Edge Replication Strategies for Wide-Area Distributed Processing
    Semmler, Niklas
    Rost, Matthias
    Smaragdakis, Georgios
    Feldmann, Anja
    PROCEEDINGS OF THE THIRD ACM INTERNATIONAL WORKSHOP ON EDGE SYSTEMS, ANALYTICS AND NETWORKING (EDGESYS'20), 2020, : 1 - 6
  • [5] A Novel Wide-area Fault Location Algorithm Based on Fault Model
    Ma, Jing
    Li, Jin-long
    Wang, Zeng-ping
    Yang, Qi-Xun
    2010 ASIA-PACIFIC POWER AND ENERGY ENGINEERING CONFERENCE (APPEEC), 2010,
  • [6] A novel wide-area fault location algorithm based on fault model
    Ma, Jing
    Li, Jin-Long
    Li, Jin-Hui
    Yang, Qi-Xun
    Wang, Zeng-Ping
    Dianli Xitong Baohu yu Kongzhi/Power System Protection and Control, 2010, 38 (20): : 74 - 78
  • [7] Fault Location Method Based on Wide-Area Voltage
    Xu, Yan
    Ying, Lu-man
    Zhi, Jing
    Feng, Ren-qing
    ENERGY DEVELOPMENT, PTS 1-4, 2014, 860-863 : 2077 - +
  • [8] Fault identification scheme for wide-area backup protection
    Wang, Yan
    Jin, Jing
    Jiao, Yanjun
    Dianli Zidonghua Shebei/Electric Power Automation Equipment, 2014, 34 (12): : 70 - 75
  • [9] Extending wide-area replication support with mobility and improved recovery
    Decker, H
    Irún-Briz, L
    Castro-Company, F
    García-Neiva, F
    Muñoz-Escoí, FD
    ADVANCED DISTRIBUTED SYSTEMS, 2005, 3563 : 10 - 20
  • [10] Taming aggressive replication in the Pangaea wide-area file system
    Saito, Y
    Karamanolis, C
    Karlsson, M
    Mahalingam, M
    USENIX ASSOCIATION PROCEEDINGS OF THE FIFTH SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION, 2002, : 15 - 30