Edge Workloads Monitoring and Failover: a StarlingX-Based Testbed Implementation and Measurement Study

Authors
Abuibaid, Mohammed [1]
Ghorab, Amir Hossein [1]
Seguin-Mcpeake, Aidan [2]
Yuen, Owen [2]
Yungblut, Thomas [2]
St-Hilaire, Marc [1, 2]
Affiliations
[1] Carleton Univ, Dept Syst & Comp Engn, Ottawa, ON K1S 5B6, Canada
[2] Carleton Univ, Sch Informat Technol, Ottawa, ON K1S 5B6, Canada
Keywords
Cloud computing; Internet of Things; Monitoring; Edge computing; Scalability; Distributed computing; Collaboration; Failure analysis; Distributed cloud infrastructure; failover; IoT; Kubernetes; microservice architecture; StarlingX platform; testbed
DOI
10.1109/ACCESS.2022.3204976
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
With the ever-growing number of time-critical, compute-intensive, and private IoT applications, High Availability (HA) Edge Clouds are becoming indispensable. Realizing HA Edge Clouds is inherently challenging due to the geographically dispersed hierarchy of the Distributed Cloud Infrastructure (DCI). For example, frequent isolation between the central Cloud and Edge Clouds due to networking instability necessitates some autonomous operation at the Edge Clouds. Furthermore, because Edge Clouds have fewer resources than central Clouds, configuring the Edge functions (i.e., control, compute, and storage) in HA clusters will reduce downtime but limit Edge scalability. To that end, StarlingX is developing an HA-protected and scalable DCI virtualization platform based on the open-source ecosystem, focusing on low-touch management of Edge Clouds. StarlingX provides a fault management service that realizes DCI-wide alarming and logging capabilities, allowing for rapid response to virtualized infrastructure events. Recently, the IETF Network Working Group proposed that monitoring both the DCI and the Edge workloads (software containers) is critical for an Edge Computing Platform to maintain HA IoT application deployment. Indeed, because the infrastructure can remain stable and healthy while the workloads suffer a fatal failure, failover functionality must monitor both the infrastructure and the Edge workloads. In this paper, motivated by the IETF direction, we first propose a dynamic failover functionality that centrally monitors Edge workloads to recover from deployment or Edge node failures. Second, we experimentally optimize the failover functionality for monitoring a microservice-architected IoT application deployed on a StarlingX-based DCI testbed to collect temperature sensor readings from Raspberry Pis.
The recorded failover measurements reveal that, regardless of how quickly the Edge workload health checks are collected, the recovery time will not drop below a minimum level determined by Edge resources and network speed. Furthermore, reducing the statistics collection timeout reduces the recovery time of an Edge node failure. However, when the timeout value is less than the minimum achievable recovery time, false-positive failures (FPFs) can occur. Third, to supplement the StarlingX fault management service, we provide a modular implementation of the proposed failover functionality. Finally, we present the first-ever introduction of the StarlingX platform's software stack to promote its use in academic research.
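The relationship between the collection timeout and false-positive failures described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name, the one-second check interval, and the rule that the monitor restarts its timer after each re-flag are all assumptions made for illustration, and the numbers are hypothetical.

```python
def count_false_positives(timeout, recovery_time, check_interval=1):
    """Count how often a recovering workload is re-flagged as failed.

    Hypothetical model: a failover starts at t=0 and the workload sends its
    next health report only at t=recovery_time. A central monitor checks
    every check_interval seconds and flags a failure whenever the time since
    the last report (or the last flag) exceeds timeout. All times are
    whole seconds.
    """
    last_event = 0  # time of the last health report or failure flag
    flags = 0
    for t in range(check_interval, recovery_time, check_interval):
        if t - last_event > timeout:
            flags += 1      # flagged while recovery is still in progress
            last_event = t  # assumption: the monitor restarts its timer
    return flags


# Timeout shorter than the recovery time: the workload is flagged again
# before it has had a chance to report.
short_timeout_flags = count_false_positives(timeout=5, recovery_time=12)

# Timeout longer than the recovery time: no spurious flag is raised.
long_timeout_flags = count_false_positives(timeout=15, recovery_time=12)
```

Under these assumptions, a timeout below the recovery time yields at least one spurious flag, while a timeout above it yields none, which matches the qualitative behaviour the abstract reports.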
Pages: 97101-97116
Number of pages: 16