Automated Fault-Tolerance Testing

Cited by: 2
Authors
Nagarajan, Adithya [1]
Vaddadi, Ajay [1]
Affiliations
[1] Groupon, Chicago, IL 60654 USA
DOI
10.1109/ICSTW.2016.34
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202; 0835;
Abstract
Software fault tolerance is the ability of computer software to continue normal operation despite system or hardware faults. Most companies are moving towards a microservices-based architecture, in which complex applications are built from a suite of small services that communicate over common protocols such as Hypertext Transfer Protocol (HTTP). While this architecture enables agility in software development and go-to-market, it makes assessing the fault tolerance and resiliency of the overall system a critical challenge. A failure in one dependent service can have an unexpected impact on upstream services and cause severe customer-facing issues. Such issues result from a lack of resiliency in the system's architecture. There is a need for an automated tool that can understand the service architecture and topology and inject faults to assess the fault tolerance and resiliency of the system. In this paper, we present Screwdriver, a new automated solution developed at Groupon to address this need.
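The idea in the abstract (model the service topology, inject a fault into one service, observe which upstream services are affected) can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not Screwdriver's implementation, which the abstract does not describe. The `Service` class, the `has_fallback` flag, the `blast_radius` helper, and the example topology are all hypothetical names introduced for illustration, and the dependency graph is assumed to be acyclic.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical service node: its downstream dependencies and whether it degrades gracefully."""
    name: str
    depends_on: list = field(default_factory=list)
    has_fallback: bool = False  # assumption: True means it can answer without its dependencies


def is_healthy(name, services, injected_faults):
    """Return True if `name` can still serve requests given the injected faults."""
    if name in injected_faults:
        return False
    svc = services[name]
    for dep in svc.depends_on:
        # A failed dependency takes this service down unless it has a fallback.
        if not is_healthy(dep, services, injected_faults) and not svc.has_fallback:
            return False
    return True


def blast_radius(target, services):
    """Inject a fault into `target` and list every other service that fails as a result."""
    faults = {target}
    return sorted(
        name for name in services
        if name != target and not is_healthy(name, services, faults)
    )


# Hypothetical topology: frontend -> checkout -> payments, frontend -> search -> catalog
services = {
    "frontend": Service("frontend", ["checkout", "search"]),
    "checkout": Service("checkout", ["payments"]),
    "payments": Service("payments"),
    "search": Service("search", ["catalog"], has_fallback=True),
    "catalog": Service("catalog"),
}

# Inject one fault at a time and report which upstream services it breaks.
for target in sorted(services):
    print(target, "->", blast_radius(target, services))
```

In this toy topology a fault in `payments` propagates to `checkout` and `frontend`, while a fault in `catalog` is contained because `search` is modelled with a fallback. A real tool would inject faults into running services (for example by delaying or failing HTTP calls) rather than into an in-memory model.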
Pages: 275-276
Number of pages: 2