Application fault tolerance with Armor middleware

Cited by: 22
Authors
Kalbarczyk, Z [1]
Iyer, RK
Wang, L
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
[2] Univ Illinois, Dept Elect & Comp Engn, Urbana, IL 61801 USA
Funding
National Science Foundation (USA)
DOI
10.1109/MIC.2005.31
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
Many current approaches to software-implemented fault tolerance (SIFT) rely on process replication, which is often prohibitively expensive for practical use due to its high performance overhead and cost. The Adaptive Reconfigurable Mobile Objects of Reliability (Armor) middleware architecture offers a scalable low-overhead way to provide high-dependability services to applications. It uses coordinated multithreaded processes to manage redundant resources across interconnected nodes, detect errors in user applications and infrastructural components, and provide failure recovery. The authors describe their experiences and lessons learned in deploying Armor in several diverse fields.
Pages: 28-37
Page count: 10