Cloud application deployment with transient failure recovery

Authors
Ioannis Giannakopoulos
Ioannis Konstantinou
Dimitrios Tsoumakos
Nectarios Koziris
Affiliations
[1] Computing Systems Laboratory, School of ECE, National Technical University of Athens
[2] Department of Informatics, Ionian University
Keywords
Cloud application deployment; Resource configuration; Transient failure; Error-recovery; Filesystem snapshot
Abstract
Application deployment is a crucial operation for modern cloud providers. The ability to dynamically allocate resources and deploy a new application instance from a user-provided description, in a fully automated manner, is of great importance to cloud users, as it allows fully reproducible application environments to be created with minimal effort. However, most modern deployment solutions do not consider the error-prone nature of the cloud: network glitches, poor synchronization between services, and other software- or infrastructure-related failures with transient characteristics are frequently encountered. Although such failures may be tolerable during an application's lifetime, during the deployment phase they can cause severe errors and make the deployment fail. To tackle this challenge, we propose AURA, an open source system that enables cloud application deployment with transient failure recovery capabilities. AURA formulates the application deployment as a Directed Acyclic Graph. Whenever a transient failure occurs, it traverses the graph, identifies the parts that failed, and re-executes the respective scripts, on the assumption that once the transient failure disappears the script execution will succeed. Moreover, to guarantee that each script execution is idempotent, AURA adopts a lightweight filesystem snapshot mechanism that aims to cancel the side effects of the failed scripts. Our thorough evaluation indicates that AURA is capable of deploying diverse real-world applications in environments exhibiting high error probabilities, while introducing a minimal time overhead that is proportional to the failure probability of the deployment scripts.
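The recovery idea described in the abstract can be illustrated with a minimal Python sketch. It is not AURA's implementation: the names `deploy`, `make_script` and `TransientError` are hypothetical, a dictionary stands in for the filesystem, and `copy.deepcopy` stands in for the lightweight filesystem snapshot. For brevity the sketch retries a failed script in place rather than re-traversing the graph after the failure, and it assumes every failure signalled by `TransientError` may disappear on a later attempt.

```python
import copy
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class TransientError(Exception):
    """Hypothetical marker for a failure that may vanish on retry."""


def deploy(graph, scripts, state, max_retries=3):
    """Run scripts in dependency order, rolling back before each retry.

    graph:   dict mapping a script name to the set of names it depends on.
    scripts: dict mapping a script name to a callable that mutates `state`.
    state:   dict standing in for the filesystem of the deployed machine.
    Returns a dict with the number of attempts each script needed.
    """
    attempts = {}
    for name in TopologicalSorter(graph).static_order():
        for attempt in range(1, max_retries + 1):
            snapshot = copy.deepcopy(state)  # stand-in for a filesystem snapshot
            try:
                scripts[name](state)
                attempts[name] = attempt
                break
            except TransientError:
                # Undo partial side effects so the retry starts from a clean state.
                state.clear()
                state.update(snapshot)
        else:
            raise RuntimeError(f"{name} failed after {max_retries} attempts")
    return attempts


def make_script(name, failures):
    """Build a script that fails `failures` times, then succeeds (illustrative)."""
    remaining = [failures]

    def script(state):
        # Non-idempotent side effect: without rollback, retries would duplicate it.
        state.setdefault("installed", []).append(name)
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientError(name)

    return script


if __name__ == "__main__":
    graph = {"db": set(), "web": {"db"}}  # "web" depends on "db"
    scripts = {"db": make_script("db", 2), "web": make_script("web", 0)}
    state = {}
    print(deploy(graph, scripts, state), state)
```

Because the state is restored before every retry, the non-idempotent append in each script takes effect only once, which is the property the snapshot mechanism is meant to provide.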