A case for two-level recovery schemes

Cited by: 31
Author
Vaidya, NH [1]
Affiliation
[1] Texas A&M Univ, Dept Comp Sci, College Stn, TX 77843 USA
Funding
National Science Foundation (USA)
Keywords
failure recovery; performance analysis; checkpointing and rollback; recovery overhead; Markov chains;
DOI
10.1109/12.689645
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Long-running applications are often subject to failures. Failures can result in significant loss of computation. Therefore, it is necessary to use a failure recovery scheme to minimize performance overhead in the presence of failures. In this paper, we argue that it is often advantageous to use "two-level" recovery schemes. A two-level recovery scheme tolerates the more probable failures with low performance overhead, while the less probable failures may possibly incur a higher overhead. By minimizing overhead for the more frequently occurring failure scenarios, the two-level approach can achieve lower performance overhead (on average) as compared to existing recovery schemes. The paper describes two two-level recovery schemes. Performance analysis using a Markov chain shows that, in practice, a two-level scheme can perform better than its "one-level" counterpart. While the conclusions of this paper are intuitive, the work on design of appropriate recovery schemes is lacking. The objective of this paper is to motivate research into recovery schemes that can provide multiple levels of fault tolerance and achieve better performance than existing recovery schemes. The paper presents an analytical approach for evaluating performance of two-level schemes and shows that such schemes are hard to optimize analytically.
Pages: 656-666
Number of pages: 11