Self-Adapting Reliability in Distributed Software Systems

Cited by: 10
Authors
Brun, Yuriy [1]
Bang, Jae Young [2]
Edwards, George [2]
Medvidovic, Nenad [2]
Affiliations
[1] Univ Massachusetts, Sch Comp Sci, Amherst, MA 01003 USA
[2] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
Funding
U.S. National Science Foundation;
Keywords
Redundancy; reliability; fault-tolerance; iterative redundancy; self-adaptation; optimal redundancy; TECHNOLOGY; TOLERANCE;
DOI
10.1109/TSE.2015.2412134
Chinese Library Classification number
TP31 [Computer Software];
Subject classification code
081202 ; 0835 ;
Abstract
Developing modern distributed software systems is difficult in part because they have little control over the environments in which they execute. For example, hardware and software resources on which these systems rely may fail or become compromised and malicious. Redundancy can help manage such failures and compromises, but when faced with dynamic, unpredictable resources and attackers, the system reliability can still fluctuate greatly. Empowering the system with self-adaptive and self-managing reliability facilities can significantly improve the quality of the software system and reduce reliance on the developer predicting all possible failure conditions. We present iterative redundancy, a novel approach to improving software system reliability by automatically injecting redundancy into the system's deployment. Iterative redundancy self-adapts in three ways: (1) by automatically detecting when the resource reliability drops, (2) by identifying unlucky parts of the computation that happen to deploy on disproportionately many compromised resources, and (3) by not relying on a priori estimates of resource reliability. Further, iterative redundancy is theoretically optimal in its resource use: Given a set of resources, iterative redundancy guarantees to use those resources to produce the most reliable version of that software system possible; likewise, given a desired increase in the system's reliability, iterative redundancy guarantees achieving that reliability using the least resources possible. Iterative redundancy handles even the Byzantine threat model, in which compromised resources collude to attack the system. We evaluate iterative redundancy in three ways. First, we formally prove its self-adaptation, efficiency, and optimality properties. Second, we simulate it at scale using discrete event simulation. Finally, we modify the existing, open-source, volunteer-computing BOINC software system and deploy it on the globally-distributed PlanetLab testbed network to empirically evaluate that iterative redundancy is self-adaptive and more efficient than existing techniques.
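The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of points (2) and (3) under stated assumptions. It assumes a margin-based voting rule: copies of a job keep being deployed until one answer leads every other answer by a fixed number of votes. The names `iterative_vote`, `make_worker`, `margin`, and `max_jobs` are hypothetical and not taken from the paper, and the simulated worker models colluding faulty resources by having them all return the same wrong answer.

```python
import random


def iterative_vote(run_job, margin, max_jobs=10_000):
    """Deploy job copies until one answer leads the runner-up by `margin` votes.

    Assumption: this margin rule is an illustrative stand-in, not the
    paper's stated procedure. Returns (winning_answer, jobs_used).
    """
    counts = {}
    for jobs_used in range(1, max_jobs + 1):
        result = run_job()
        counts[result] = counts.get(result, 0) + 1
        ranked = sorted(counts.values(), reverse=True)
        runner_up = ranked[1] if len(ranked) > 1 else 0
        if ranked[0] - runner_up >= margin:
            winner = max(counts, key=counts.get)
            return winner, jobs_used
    raise RuntimeError("no answer reached the required margin")


def make_worker(correct, wrong, reliability, rng):
    """Hypothetical resource pool: answers correctly with probability
    `reliability`; faulty resources collude on one wrong answer."""
    def run_job():
        return correct if rng.random() < reliability else wrong
    return run_job


if __name__ == "__main__":
    rng = random.Random(42)
    worker = make_worker("ok", "bad", reliability=0.7, rng=rng)
    winner, used = iterative_vote(worker, margin=4)
    print(f"winner={winner!r} after {used} jobs")
```

In this sketch the voting loop never reads `reliability`, so no a priori estimate is needed, and a computation that happens to draw many faulty resources simply consumes more jobs before the margin is reached.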
Pages: 764-780
Number of pages: 17