Automatic model-driven recovery in distributed systems

Cited by: 20
Authors
Joshi, KR [1]
Hiltunen, MA [1]
Sanders, WH [1]
Schlichting, RD [1]
Affiliations
[1] Univ Illinois, Coordinated Sci Lab, Urbana, IL 61801 USA
DOI
10.1109/RELDIS.2005.11
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Automatic system monitoring and recovery has the potential to provide a low-cost solution for high availability. However, automating recovery is difficult in practice because accurate fault diagnosis is hard given the low coverage, poor localization ability, and false positives inherent in many widely used monitoring techniques. In this paper we present a holistic model-based approach that overcomes these challenges and enables automatic recovery in distributed systems. To do so, it uses theoretically sound techniques, including Bayesian estimation and Markov decision theory, to provide controllers that choose good, if not optimal, recovery actions according to user-defined optimization criteria. By combining monitoring and recovery, the approach realizes benefits that could not be obtained by using either in isolation. We present two recovery algorithms with complementary properties and trade-offs, and validate them (through simulation) by fault injection on a realistic e-commerce system.
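As a rough illustration of the idea in the abstract, the sketch below combines a Bayesian update over fault hypotheses with a cost-based choice of recovery action. It is not the paper's algorithm: the fault names, monitor names, probabilities, and costs are all hypothetical, and the action choice is a one-step expected-cost minimization rather than a full Markov decision controller.

```python
# Illustrative sketch only. All faults, monitors, probabilities, and costs
# below are hypothetical values chosen for the example.

# P(monitor raises an alarm | fault). Values below 1.0 for a real fault model
# imperfect coverage; nonzero values for "none" model false positives.
ALARM_PROB = {
    "ping_monitor": {"none": 0.05, "db_crash": 0.10, "web_hang": 0.90},
    "query_monitor": {"none": 0.02, "db_crash": 0.95, "web_hang": 0.60},
}

# Cost of running each recovery action, and the set of states it leaves healthy.
ACTION_COST = {"do_nothing": 0.0, "restart_db": 5.0, "restart_web": 3.0}
FIXES = {
    "do_nothing": {"none"},
    "restart_db": {"none", "db_crash"},
    "restart_web": {"none", "web_hang"},
}
UNFIXED_PENALTY = 20.0  # assumed cost of leaving a fault unrepaired


def bayes_update(belief, monitor, alarmed):
    """Return the posterior over faults after one monitor observation."""
    posterior = {}
    for fault, prior in belief.items():
        p_alarm = ALARM_PROB[monitor][fault]
        likelihood = p_alarm if alarmed else 1.0 - p_alarm
        posterior[fault] = prior * likelihood
    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {fault: p / total for fault, p in posterior.items()}


def expected_cost(belief, action):
    """Action cost plus the penalty weighted by the chance the fault remains."""
    p_unfixed = sum(p for fault, p in belief.items() if fault not in FIXES[action])
    return ACTION_COST[action] + UNFIXED_PENALTY * p_unfixed


def choose_action(belief):
    """Pick the action with the lowest one-step expected cost."""
    return min(ACTION_COST, key=lambda action: expected_cost(belief, action))


# Usage: start from a prior, fold in two monitor readings, then pick an action.
belief = {"none": 0.90, "db_crash": 0.05, "web_hang": 0.05}
belief = bayes_update(belief, "query_monitor", alarmed=True)
belief = bayes_update(belief, "ping_monitor", alarmed=False)
action = choose_action(belief)
print(belief, action)
```

Combining the two monitor readings is what narrows the diagnosis here: neither monitor alone localizes the fault, but an alarm from one together with silence from the other shifts most of the probability mass onto a single hypothesis.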
Pages: 25-36
Page count: 12
Related papers
50 in total
  • [21] Model-Driven Development of Distributed Ledger Applications
    Fraternali, Piero
    Gonzalez, Sergio Luis Herrera
    Frigerio, Matteo
    Righetti, Mattia
    [J]. DATABASE SYSTEMS FOR ADVANCED APPLICATIONS. DASFAA 2022 INTERNATIONAL WORKSHOPS, 2022, 13248 : 104 - 119
  • [22] A Model-driven Workflow for Distributed Microservice Development
    Rademacher, Florian
    Sorgalla, Jonas
    Sachweh, Sabine
    Zuendorf, Albert
    [J]. SAC '19: PROCEEDINGS OF THE 34TH ACM/SIGAPP SYMPOSIUM ON APPLIED COMPUTING, 2019, : 1260 - 1262
  • [23] Virtualization for Testing in Model-driven Distributed System
    Kim, Youngheum
    Lee, Seungyong
    Kim, Seungbeom
    [J]. 2012 IEEE 75TH VEHICULAR TECHNOLOGY CONFERENCE (VTC SPRING), 2012,
  • [24] Formal Model-Driven Design of Distributed Algorithms
    Kuhnrich, Morten
    [J]. ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2009, 251 : 49 - 64
  • [25] MODEL-DRIVEN PERFORMANCE PREDICTION OF HLA-BASED DISTRIBUTED SIMULATION SYSTEMS
    Gianni, Daniele
    Bocciarelli, Paolo
    D'Ambrogio, Andrea
    [J]. 2012 WINTER SIMULATION CONFERENCE (WSC), 2012,
  • [26] UML-Based Modeling and Model-Driven Development of Distributed Control Systems
    Basile, Francesco
    Chiacchio, Pasquale
    Del Grosso, Domenico
    [J]. 2008 IEEE INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES AND FACTORY AUTOMATION, PROCEEDINGS, 2008, : 1120 - 1127
  • [27] A model-driven framework for the generation of gateways in distributed real-time systems
    Obermaisser, R.
    [J]. RTSS 2007: 28TH IEEE INTERNATIONAL REAL-TIME SYSTEMS SYMPOSIUM, PROCEEDINGS, 2007, : 93 - 104
  • [28] Model-driven Performance Prediction of Systems of Systems
    Falkner, Katrina
    Szabo, Claudia
    Chiprianov, Vanea
    [J]. 19TH ACM/IEEE INTERNATIONAL CONFERENCE ON MODEL DRIVEN ENGINEERING LANGUAGES AND SYSTEMS (MODELS'16), 2016, : 44 - 44
  • [29] Model-driven performance prediction of systems of systems
    Falkner, Katrina
    Szabo, Claudia
    Chiprianov, Vanea
    Puddy, Gavin
    Rieckmann, Marianne
    Fraser, Dan
    Aston, Cathlyn
    [J]. SOFTWARE AND SYSTEMS MODELING, 2018, 17 (02): : 415 - 441
  • [30] Model-driven performance prediction of systems of systems
    Katrina Falkner
    Claudia Szabo
    Vanea Chiprianov
    Gavin Puddy
    Marianne Rieckmann
    Dan Fraser
    Cathlyn Aston
    [J]. Software & Systems Modeling, 2018, 17 : 415 - 441