Automatic model-driven recovery in distributed systems

Cited by: 20
Authors
Joshi, KR [1]
Hiltunen, MA [1]
Sanders, WH [1]
Schlichting, RD [1]
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
DOI
10.1109/RELDIS.2005.11
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Subject classification code
081202;
Abstract
Automatic system monitoring and recovery has the potential to provide a low-cost solution for high availability. However, automating recovery is difficult in practice because accurate fault diagnosis is challenging given the low coverage, poor localization ability, and false positives inherent in many widely used monitoring techniques. In this paper, we present a holistic model-based approach that overcomes these challenges and enables automatic recovery in distributed systems. To do so, it uses theoretically sound techniques, including Bayesian estimation and Markov decision theory, to provide controllers that choose good, if not optimal, recovery actions according to user-defined optimization criteria. By combining monitoring and recovery, the approach realizes benefits that could not have been obtained by using them in isolation. We present two recovery algorithms with complementary properties and trade-offs, and validate them (through simulation) by fault injection on a realistic e-commerce system.
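The abstract's combination of Bayesian estimation with cost-driven action selection can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it replaces the Markov decision formulation with a single-step expected-cost comparison, and every fault name, monitor name, probability, and cost below is a hypothetical value chosen only for illustration.

```python
from typing import Dict

# Hypothetical monitors: P(alarm | fault). Values below 1 for a real fault model
# low coverage; values above 0 for "none" model false positives.
ALARM_PROB: Dict[str, Dict[str, float]] = {
    "ping_monitor": {"none": 0.05, "db_crash": 0.10, "web_hang": 0.90},
    "query_monitor": {"none": 0.02, "db_crash": 0.95, "web_hang": 0.60},
}

# Hypothetical recovery actions: their cost and the fault states they leave healthy.
ACTION_COST: Dict[str, float] = {"do_nothing": 0.0, "restart_db": 4.0, "restart_web": 2.0}
FIXES = {
    "do_nothing": {"none"},
    "restart_db": {"none", "db_crash"},
    "restart_web": {"none", "web_hang"},
}
UNFIXED_PENALTY = 20.0  # assumed cost of leaving a fault in place


def bayes_update(prior: Dict[str, float], observations: Dict[str, bool]) -> Dict[str, float]:
    """Return P(fault | monitor outputs), assuming monitors are independent given the fault."""
    posterior = dict(prior)
    for monitor, alarmed in observations.items():
        for fault in posterior:
            p_alarm = ALARM_PROB[monitor][fault]
            posterior[fault] *= p_alarm if alarmed else 1.0 - p_alarm
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observations are impossible under the model")
    return {fault: weight / total for fault, weight in posterior.items()}


def expected_cost(action: str, belief: Dict[str, float]) -> float:
    """Action cost plus the penalty weighted by the probability the fault survives."""
    p_unfixed = sum(p for fault, p in belief.items() if fault not in FIXES[action])
    return ACTION_COST[action] + UNFIXED_PENALTY * p_unfixed


def choose_action(belief: Dict[str, float]) -> str:
    """Greedy one-step choice; a full MDP-based controller would plan over sequences."""
    return min(ACTION_COST, key=lambda action: expected_cost(action, belief))


if __name__ == "__main__":
    prior = {"none": 0.90, "db_crash": 0.05, "web_hang": 0.05}
    belief = bayes_update(prior, {"ping_monitor": False, "query_monitor": True})
    print(belief, choose_action(belief))
```

In this toy setup, a single alarm from the query monitor shifts most of the belief onto the database fault even though the prior favors a healthy system, so the cheapest expected outcome is to restart the database. The one-step rule is used here only to keep the sketch short.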
Pages: 25-36
Number of pages: 12
Related papers
50 in total
  • [1] Probabilistic Model-Driven Recovery in Distributed Systems
    Joshi, Kaustubh R.
    Hiltunen, Matti A.
    Sanders, William H.
    Schlichting, Richard D.
    [J]. IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING, 2011, 8 (06) : 913 - 928
  • [2] Model-driven distributed systems
    Coutts, IA
    Edwards, JM
    [J]. IEEE CONCURRENCY, 1997, 5 (03) : 55 - &
  • [3] Research of Model-driven Distributed Automatic Test Execution Framework
    Liu, X. -M.
    Liu, Y. -P.
    Liu, S. -M.
    Wu, Ji
    [J]. 2011 AASRI CONFERENCE ON APPLIED INFORMATION TECHNOLOGY (AASRI-AIT 2011), VOL 1, 2011, : 13 - 18
  • [4] Dynamic Adaptation for Distributed Systems in Model-Driven Engineering
    Mohammed, Mufasir Muthaher
    [J]. ACM/IEEE 25TH INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS, MODELS 2022 COMPANION, 2022, : 146 - 151
  • [5] Model-driven scheduling for distributed stream processing systems
    Shukla, Anshu
    Simmhan, Yogesh
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2018, 117 : 98 - 114
  • [6] OpenPMF: A model-driven security framework for distributed systems
    Lang, U
    Schreiner, R
    [J]. ISSE 2004 - SECURING ELECTRONIC BUSINESS PROCESSES, 2004, : 138 - 147
  • [7] Model-driven engineering of middleware-mediated distributed systems
    Silaghi, R
    Strohmeier, A
    [J]. UML MODELING LANGUAGES AND APPLICATIONS, 2005, 3297 : 259 - 263
  • [8] A Model-Driven Approach to Enable the Distributed Simulation of Complex Systems
    Bocciarelli, Paolo
    D'Ambrogio, Andrea
    Falcone, Alberto
    Garro, Alfredo
    Giglio, Andrea
    [J]. COMPLEX SYSTEMS DESIGN & MANAGEMENT (CSD&M 2015), 2016, : 171 - 183
  • [9] Preserving distributed systems' critical properties: A model-driven approach
    Yilmaz, C
    Memon, AN
    Porter, AA
    Krishna, AS
    Schmidt, DC
    Gokhale, A
    Natarajan, B
    [J]. IEEE SOFTWARE, 2004, 21 (06) : 32 - +
  • [10] Model-Driven Design of Network Aspects of Distributed Embedded Systems
    Ebeid, Emad
    Fummi, Franco
    Quaglia, Davide
    [J]. IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2015, 34 (04) : 603 - 614