Cloud application deployment with transient failure recovery

Authors
Ioannis Giannakopoulos
Ioannis Konstantinou
Dimitrios Tsoumakos
Nectarios Koziris
Affiliations
[1] Computing Systems Laboratory
[2] School of ECE
[3] National Technical University of Athens
[4] Department of Informatics
[5] Ionian University
Keywords
Cloud application deployment; Resource configuration; Transient failure; Error-recovery; Filesystem snapshot
DOI
Not available
Abstract
Application deployment is a crucial operation for modern cloud providers. The ability to dynamically allocate resources and deploy a new application instance from a user-provided description, in a fully automated manner, is of great importance to cloud users, as it allows them to generate fully reproducible application environments with minimal effort. However, most modern deployment solutions do not consider the error-prone nature of the cloud: network glitches, poor synchronization between services, and other software- or infrastructure-related failures with transient characteristics are frequently encountered. Although such failures may be tolerable during an application's lifetime, during the deployment phase they can cause severe errors and make the deployment fail. To tackle this challenge, we propose AURA, an open-source system that enables cloud application deployment with transient failure recovery. AURA models the application deployment as a Directed Acyclic Graph (DAG). Whenever a transient failure occurs, AURA traverses the graph, identifies the parts that failed, and re-executes the respective scripts, on the premise that script execution will succeed once the transient failure disappears. Moreover, to guarantee that each script execution is idempotent, AURA adopts a lightweight filesystem snapshot mechanism that aims to cancel the side effects of the failed scripts. Our thorough evaluation indicates that AURA can deploy diverse real-world applications in environments exhibiting high error probabilities while introducing a minimal time overhead, proportional to the failure probability of the deployment scripts.
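The recovery idea described in the abstract can be illustrated with a minimal Python sketch. This is not AURA's implementation or API: the names `deploy`, `flaky`, and `TransientError` are hypothetical, an in-memory dictionary stands in for the filesystem, and a deep copy of that dictionary stands in for the lightweight filesystem snapshot. The sketch runs scripts in dependency order, rolls back to the snapshot when a script fails transiently, and re-executes it.

```python
import copy
from graphlib import TopologicalSorter  # Python 3.9+


class TransientError(Exception):
    """Hypothetical marker for a failure expected to disappear on retry."""


def deploy(graph, scripts, state, max_attempts=5):
    """Run each script in dependency order, retrying transient failures.

    graph:   maps each node to the set of nodes it depends on (a DAG).
    scripts: maps each node to a callable that mutates `state`.
    state:   dict standing in for the filesystem (an assumption of this sketch).
    Returns the number of attempts each node needed.
    """
    attempts = {}
    for node in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_attempts + 1):
            snapshot = copy.deepcopy(state)  # stand-in for a filesystem snapshot
            try:
                scripts[node](state)
                attempts[node] = attempt
                break
            except TransientError:
                # Roll back the side effects of the failed run so that
                # re-executing the script starts from a clean state.
                state.clear()
                state.update(snapshot)
        else:
            raise RuntimeError(f"{node} failed after {max_attempts} attempts")
    return attempts


def flaky(name, failures):
    """Build a script that leaves a partial side effect and fails `failures` times."""
    remaining = {"n": failures}

    def run(state):
        state.setdefault("log", []).append(f"{name}:partial")
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise TransientError(name)
        state["log"][-1] = f"{name}:done"

    return run


if __name__ == "__main__":
    graph = {"db": set(), "web": {"db"}, "lb": {"web"}}
    scripts = {"db": flaky("db", 0), "web": flaky("web", 2), "lb": flaky("lb", 1)}
    state = {}
    print(deploy(graph, scripts, state), state["log"])
```

Without the rollback step, every failed attempt would leave a stale `partial` entry behind; restoring the snapshot is what makes re-execution safe in this toy model, which mirrors the role the abstract assigns to filesystem snapshots.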