Application fault tolerance with Armor middleware

Cited by: 22
Authors
Kalbarczyk, Z [1]
Iyer, RK
Wang, L
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
DOI
10.1109/MIC.2005.31
Chinese Library Classification number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Many current approaches to software-implemented fault tolerance (SIFT) rely on process replication, which is often prohibitively expensive for practical use due to its high performance overhead and cost. The Adaptive Reconfigurable Mobile Objects of Reliability (Armor) middleware architecture offers a scalable low-overhead way to provide high-dependability services to applications. It uses coordinated multithreaded processes to manage redundant resources across interconnected nodes, detect errors in user applications and infrastructural components, and provide failure recovery. The authors describe their experiences and lessons learned in deploying Armor in several diverse fields.
Pages: 28-37
Page count: 10
Related papers
  • [31] ROAFTS: A middleware architecture for real-time object-oriented adaptive fault tolerance support
    Kim, KH
    [J]. THIRD IEEE INTERNATIONAL HIGH-ASSURANCE SYSTEMS ENGINEERING SYMPOSIUM, PROCEEDINGS, 1998, : 50 - 57
  • [32] Application level fault tolerance in heterogeneous networks of workstations
    Beguelin, A
    Seligman, E
    Stephan, P
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1997, 43 (02) : 147 - 155
  • [34] Enhancing application robustness through adaptive fault tolerance
    Lan, Zhiling
    Li, Yawei
    Zheng, Ziming
    Gujrati, Prashasta
    [J]. 2008 IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL & DISTRIBUTED PROCESSING, VOLS 1-8, 2008, : 2455 - 2459
  • [36] Trivariate Bernoulli distribution with application to software fault tolerance
    Fiondella, Lance
    Zeephongsekul, Panlop
    [J]. ANNALS OF OPERATIONS RESEARCH, 2016, 244 (01) : 241 - 255
  • [37] Fault-tolerant middleware for robots
    Baek, BumHyeon
    Park, HongSeong
    [J]. WMSCI 2007: 11TH WORLD MULTI-CONFERENCE ON SYSTEMICS, CYBERNETICS AND INFORMATICS, VOL II, PROCEEDINGS, 2007, : 90 - 95
  • [38] Investigation of intrusion tolerance for COTS middleware
    Rathi, M
    Anjum, F
    Zbib, R
    Ghosh, A
    Umar, A
    [J]. 2002 IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, VOLS 1-5, CONFERENCE PROCEEDINGS, 2002, : 1169 - 1173
  • [39] A survey of linguistic structures for application-level fault tolerance
    De Florio, Vincenzo
    Blondia, Chris
    [J]. ACM COMPUTING SURVEYS, 2008, 40 (02)
  • [40] Application-level correctness and its impact on fault tolerance
    Li, Xuanhua
    Yeung, Donald
    [J]. THIRTEENTH INTERNATIONAL SYMPOSIUM ON HIGH-PERFORMANCE COMPUTER ARCHITECTURE, PROCEEDINGS, 2007, : 181 - +