Software fault detection and recovery in critical real-time systems: An approach based on loose coupling

Cited by: 6
Authors
Alho, Pekka [1 ]
Mattila, Jouni [1 ]
Affiliations
[1] Tampere Univ Technol, Dept Intelligent Hydraul & Automat, FIN-33101 Tampere, Finland
Keywords
ITER; Remote handling; Software; Fault tolerance; Dependability; Real-time; TOLERANCE;
DOI
10.1016/j.fusengdes.2014.04.050
Chinese Library Classification (CLC) number
TL [Atomic energy technology]; O571 [Nuclear physics];
Discipline classification code
0827 ; 082701 ;
Abstract
Remote handling (RH) systems are used to inspect, make changes to, and maintain components in the ITER machine and as such are an example of a mission-critical system. Failure in a critical system may cause damage, significant financial losses, and loss of experiment runtime, making dependability one of its most important properties. However, even if the software for RH control systems has been developed using best practices, the system might still fail due to undetected faults (bugs), hardware failures, etc. Critical systems therefore need the capability to tolerate faults and resume operation after their occurrence. However, the design of effective fault detection and recovery mechanisms poses a challenge due to timeliness requirements, growth in scale, and complex interactions. In this paper we evaluate the effectiveness of a service-oriented architectural approach to fault tolerance in mission-critical real-time systems. We use a prototype implementation for service management with an experimental RH control system and an industrial manipulator. The fault tolerance is based on using the high level of decoupling between services to recover from transient faults by service restarts. If the recovery process is not successful, the system can still be used, provided the fault was not in a critical software module. © 2014 Elsevier B.V. All rights reserved.
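The recovery idea in the abstract (restart a faulted service, and keep the rest of the system usable if a non-critical service cannot be recovered) can be illustrated with a minimal sketch. This is not the paper's prototype: the `Service` and `Supervisor` classes, the `max_restarts` limit, the three status strings, and the service names are all hypothetical, and the real-time constraints discussed in the abstract are not modelled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Service:
    """A loosely coupled service (hypothetical model, not the paper's API)."""
    name: str
    critical: bool
    start: Callable[[], bool]  # returns True if the service came up healthy
    state: str = "running"


@dataclass
class Supervisor:
    """Restarts faulted services and reports overall system status."""
    max_restarts: int = 3  # assumed retry limit
    services: Dict[str, Service] = field(default_factory=dict)

    def register(self, service: Service) -> None:
        self.services[service.name] = service

    def on_fault(self, name: str) -> str:
        """Try to recover one faulted service by restarting it."""
        service = self.services[name]
        service.state = "faulted"
        for _ in range(self.max_restarts):
            if service.start():
                service.state = "running"  # transient fault cleared
                return self.status()
        service.state = "failed"  # recovery was not successful
        return self.status()

    def status(self) -> str:
        """Classify the system from the states of its services."""
        down = [s for s in self.services.values() if s.state != "running"]
        if not down:
            return "operational"
        # The system stays usable only if no critical service is down.
        return "halted" if any(s.critical for s in down) else "degraded"


def flaky_start(failures_before_success: int) -> Callable[[], bool]:
    """Build a start function that fails a fixed number of times first."""
    remaining = {"n": failures_before_success}

    def start() -> bool:
        if remaining["n"] > 0:
            remaining["n"] -= 1
            return False
        return True

    return start


sup = Supervisor(max_restarts=3)
sup.register(Service("motion_control", critical=True, start=flaky_start(1)))
sup.register(Service("camera_overlay", critical=False, start=lambda: False))

print(sup.on_fault("motion_control"))  # operational
print(sup.on_fault("camera_overlay"))  # degraded
```

Because each service is restarted in isolation, a fault in one service does not force a restart of the others; this is the property that loose coupling is meant to provide in the sketch.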
Pages: 2272-2277
Number of pages: 6
Related papers
50 in total
  • [1] A responsiveness approach for scheduling fault recovery in real-time systems
    Mejia-Alvarez, P
    Mossé, D
    PROCEEDINGS OF THE FIFTH IEEE REAL-TIME TECHNOLOGY AND APPLICATIONS SYMPOSIUM, 1999, : 4 - 13
  • [2] SOFTWARE FAULT TOLERANCE IN REAL-TIME SYSTEMS
    KANT, K
    INFORMATION SCIENCES, 1987, 42 (03) : 255 - 282
  • [3] FAULT DETECTION IN FLUID SYSTEMS: AN INTERACTIVE REAL-TIME APPROACH
    Angeli, C.
    Chatzinikolaou, A.
    8TH INTERNATIONAL INDUSTRIAL SIMULATION CONFERENCE 2010, ISC 2010, 2010, : 223 - 227
  • [4] An approach to automatic detection of software failures in real-time systems
    Savor, T
    Seviora, RE
    THIRD IEEE REAL-TIME TECHNOLOGY AND APPLICATIONS SYMPOSIUM, PROCEEDINGS, 1997, : 136 - 146
  • [5] CRITICAL ISSUES IN REAL-TIME SOFTWARE SYSTEMS
    AOYAMA, M
    PROCEEDINGS : THE THIRTEENTH ANNUAL INTERNATIONAL COMPUTER SOFTWARE & APPLICATIONS CONFERENCE, 1989, : 434 - 435
  • [6] Real-time fault detection approach of software under big data environment
    Jian Xianrui
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTOMATION, MECHANICAL CONTROL AND COMPUTATIONAL ENGINEERING, 2015, 124 : 1215 - 1220
  • [7] Fault recovery based on checkpointing for hard real-time embedded systems
    Zhang, Y
    Chakrabarty, K
    18TH IEEE INTERNATIONAL SYMPOSIUM ON DEFECT AND FAULT TOLERANCE IN VLSI SYSTEMS, PROCEEDINGS, 2003, : 320 - 327
  • [8] MICROCOMPUTER REAL-TIME SOFTWARE-RELIABILITY AND FAULT RECOVERY
    LOMBARDI, F
    MICROELECTRONICS AND RELIABILITY, 1982, 22 (04) : 693 - 697
  • [9] A FRAMEWORK FOR SOFTWARE FAULT TOLERANCE IN REAL-TIME SYSTEMS
    ANDERSON, T
    KNIGHT, JC
    IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 1983, 9 (03) : 355 - 364
  • [10] A Real-Time Configuration Approach for an Observer-Based Residual Generator of Fault Detection Systems
    Zhao, Hao
    Luo, Hao
    Liu, Tianyu
    PROCESSES, 2022, 10 (02)