A Fault-Tolerance Shim for Serverless Computing

Cited by: 37
Authors
Sreekanti, Vikram [1]
Wu, Chenggang [1]
Chhatrapati, Saurav [1]
Gonzalez, Joseph E. [1]
Hellerstein, Joseph M. [1]
Faleiro, Jose M. [2]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Microsoft Res, Redmond, WA USA
DOI
10.1145/3342195.3387535
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Serverless computing has grown in popularity in recent years, with an increasing number of applications being built on Functions-as-a-Service (FaaS) platforms. By default, FaaS platforms support retry-based fault tolerance, but this is insufficient for programs that modify shared state, as they can unwittingly persist partial sets of updates in case of failures. To address this challenge, we would like atomic visibility of the updates made by a FaaS application. In this paper, we present AFT, an atomic fault tolerance shim for serverless applications. AFT interposes between a commodity FaaS platform and storage engine and ensures atomic visibility of updates by enforcing the read atomic isolation guarantee. AFT supports new protocols to guarantee read atomic isolation in the serverless setting. We demonstrate that AFT introduces minimal overhead relative to existing storage engines and scales smoothly to thousands of requests per second, while preventing a significant number of consistency anomalies.
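The failure mode described in the abstract (a retried function leaving a partial set of updates behind) can be illustrated with a minimal Python sketch. This is not AFT's protocol: `ToyAtomicShim`, `transfer`, the in-memory dict standing in for a storage engine, and the key names are all hypothetical, and a real shim would need a durable commit record instead of a single in-process `dict.update`. The sketch only shows the general idea of buffering a request's writes and exposing them all at once.

```python
class ToyAtomicShim:
    """Hypothetical write-buffering layer over a dict acting as the storage engine."""

    def __init__(self):
        self.storage = {}   # committed key -> value (visible to every request)
        self.buffers = {}   # transaction id -> writes not yet visible
        self.next_id = 0

    def begin(self):
        self.next_id += 1
        self.buffers[self.next_id] = {}
        return self.next_id

    def put(self, txn, key, value):
        # Writes stay in the private buffer until commit.
        self.buffers[txn][key] = value

    def get(self, txn, key):
        # A request sees its own pending writes first, then committed data.
        buf = self.buffers[txn]
        return buf[key] if key in buf else self.storage.get(key)

    def commit(self, txn):
        # All buffered writes become visible in one step.
        self.storage.update(self.buffers.pop(txn))

    def abort(self, txn):
        # Discard the buffer; committed data is untouched.
        self.buffers.pop(txn, None)


def transfer(shim, amount, fail_midway=False):
    """Hypothetical FaaS function that updates two keys as one logical unit."""
    txn = shim.begin()
    try:
        shim.put(txn, "a", shim.get(txn, "a") - amount)
        if fail_midway:
            raise RuntimeError("simulated function crash")
        shim.put(txn, "b", shim.get(txn, "b") + amount)
        shim.commit(txn)
    except RuntimeError:
        shim.abort(txn)
        raise
```

With this structure, a crash between the two `put` calls leaves the committed values of `a` and `b` unchanged, so a platform-level retry starts from a consistent state instead of applying the first update twice.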
Pages: 15