Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

Authors
Abdi, Fardin [1]
Mancuso, Renato [1]
Tabish, Rohan [1]
Caccamo, Marco [1]
Affiliations
[1] Univ Illinois, Dept Comp Sci, 1304 W Springfield Ave, Urbana, IL 61801 USA
Funding
U.S. National Science Foundation
Keywords
REAL-TIME SYSTEMS; PERIODIC TASKS;
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality while still providing verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable the use of components such as real-time operating systems (RTOS), without requiring their correctness to guarantee safety, are necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and the RTOS and to recover from them, while ensuring that the timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring the application timing, and fault recovery is achieved via full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models.
Pages: 10