Restart-Based Fault-Tolerance: System Design and Schedulability Analysis

Cited by: 0

Authors:

Abdi, Fardin [1]

Mancuso, Renato [1]

Tabish, Rohan [1]

Caccamo, Marco [1]

Affiliations:

[1] Univ Illinois, Dept Comp Sci, 1304 W Springfield Ave, Urbana, IL 61801 USA

Source:

2017 IEEE 23rd International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA) | 2017

Funding:

U.S. National Science Foundation

Keywords:

REAL-TIME SYSTEMS; PERIODIC TASKS;

DOI:

Not available

Chinese Library Classification:

TP3 [Computing technology, computer technology]

Discipline classification code:

0812

Abstract:

Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality while still being expected to provide verified safety guarantees. However, platform-wide software verification (required for safety) is often expensive. Design methods are therefore needed that enable the use of components such as real-time operating systems (RTOS) without requiring their correctness to guarantee safety. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and the RTOS and to recover from them, while ensuring that the timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring application timing, and fault recovery is achieved via a full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models.


Pages: 10
