Software fault detection and recovery in critical real-time systems: An approach based on loose coupling

Cited by: 6
Authors
Alho, Pekka [1 ]
Mattila, Jouni [1 ]
Affiliations
[1] Tampere Univ Technol, Dept Intelligent Hydraul & Automat, FIN-33101 Tampere, Finland
Keywords
ITER; Remote handling; Software; Fault tolerance; Dependability; Real-time; TOLERANCE;
DOI
10.1016/j.fusengdes.2014.04.050
Chinese Library Classification (CLC) number
TL [Atomic energy technology]; O571 [Nuclear physics];
Discipline classification code
0827 ; 082701 ;
Abstract
Remote handling (RH) systems are used to inspect, modify, and maintain components in the ITER machine, and as such are an example of a mission-critical system. Failure in a critical system may cause damage, significant financial losses, and loss of experiment runtime, making dependability one of its most important properties. However, even if the software for RH control systems has been developed using best practices, the system might still fail due to undetected faults (bugs), hardware failures, etc. Critical systems therefore need the capability to tolerate faults and resume operation after they occur. Designing effective fault detection and recovery mechanisms is nevertheless challenging because of timeliness requirements, growth in scale, and complex interactions. In this paper we evaluate the effectiveness of a service-oriented architectural approach to fault tolerance in mission-critical real-time systems. We use a prototype implementation for service management with an experimental RH control system and an industrial manipulator. The fault tolerance relies on the high level of decoupling between services to recover from transient faults by restarting services. If the recovery process is not successful, the system can still be used provided the fault was not in a critical software module. © 2014 Elsevier B.V. All rights reserved.
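The recovery policy outlined in the abstract (restart a faulty service, and keep operating if an unrecoverable fault is confined to a non-critical module) can be illustrated with a minimal Python sketch. This is not the authors' prototype: the names `Service`, `Supervisor`, `report_fault`, and `max_restarts`, as well as the bounded retry count, are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Service:
    """Hypothetical loosely coupled service managed by a supervisor."""
    name: str
    critical: bool               # True if the system cannot run without it
    start: Callable[[], bool]    # returns True if the service came up healthy
    healthy: bool = True


class Supervisor:
    """Illustrative service manager: restart on fault, degrade if possible."""

    def __init__(self, services: List[Service], max_restarts: int = 3) -> None:
        self.services: Dict[str, Service] = {s.name: s for s in services}
        self.max_restarts = max_restarts  # assumed bound on restart attempts

    def report_fault(self, name: str) -> str:
        svc = self.services[name]
        svc.healthy = False
        # Transient faults are expected to clear after a restart.
        for _ in range(self.max_restarts):
            if svc.start():
                svc.healthy = True
                return "recovered"
        # Recovery failed: the system stays usable only if the faulty
        # service is not a critical module.
        return "halted" if svc.critical else "degraded"
```

In this sketch, a call such as `Supervisor([...]).report_fault("camera")` returns `"recovered"`, `"degraded"`, or `"halted"`, which a caller could use to decide whether operation may continue. Because each service is restarted in isolation, the other services are untouched, which is the benefit the abstract attributes to loose coupling.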
Pages: 2272-2277
Number of pages: 6