FRAME: Fault Tolerant and Real-Time Messaging for Edge Computing

Cited by: 13
Authors
Wang, Chao [1]
Gill, Christopher [1]
Lu, Chenyang [1]
Affiliations
[1] Washington Univ, Dept Comp Sci & Engn, St Louis, MO 14263 USA
Funding
National Science Foundation (USA)
Keywords
DOI
10.1109/ICDCS.2019.00101
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Edge computing systems for Industrial Internet of Things (IIoT) applications require reliable and timely message delivery. Both latency discrepancies within edge clouds and heterogeneous loss-tolerance and latency requirements pose new challenges for proper quality of service differentiation. Efficient differentiated edge computing architectures are also needed, especially when common fault-tolerant mechanisms tend to introduce additional latency, and when cloud traffic may impede local, time-sensitive message delivery. In this paper, we introduce FRAME, a fault-tolerant real-time messaging architecture. We first develop timing bounds that capture the relation between traffic/service parameters and loss-tolerance/latency requirements, and then illustrate how such bounds can support proper differentiation in a representative IIoT scenario. Specifically, FRAME leverages those timing bounds to schedule message delivery and replication actions to meet needed levels of assurance. FRAME is implemented on top of the TAO real-time event service, and we present empirical evaluations in a local edge computing testbed and an Amazon Virtual Private Cloud. The results of those evaluations show that FRAME can efficiently meet different levels of message loss-tolerance requirements, mitigate latency penalties caused by fault recovery, and meet end-to-end soft deadlines during normal, fault-free operation.
Pages: 976-985
Page count: 10