Exploiting Redundancies to Enhance Schedulability in Fault-Tolerant and Real-Time Distributed Systems

Cited by: 26
Authors
Luo, Wei [1]
Qin, Xiao [2]
Tan, Xian-Chun [1]
Qin, Ke [1]
Manzanares, Adam [2]
Affiliations
[1] China Ship Dev & Design Ctr, Dept Informat Syst, Wuhan 430064, Peoples R China
[2] Auburn Univ, Dept Comp Sci & Software Engn, Auburn, AL 36849 USA
Funding
U.S. National Science Foundation; National Natural Science Foundation of China
Keywords
Distributed systems; fault tolerance; rate-monotonic (RM) algorithm; real-time task scheduling; primary-backup copy; SCHEDULING ALGORITHM; APERIODIC TASKS; ASSIGNMENT; RECOVERY; SOFT;
DOI
10.1109/TSMCA.2009.2013192
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In the past decades, distributed systems have been widely applied to real-time applications, most of which have fault-tolerance requirements to assure high reliability. Due to the stringent space constraints of real-time systems, the issue of schedulability becomes a major concern in the design of fault-tolerant and real-time distributed systems. Most existing real-time and fault-tolerant scheduling algorithms, which are based on the primary-backup scheme for periodic real-time tasks, introduce unnecessary redundancies by aggressively using active-backup copies. To solve this problem, we propose two novel fault-tolerant techniques, which are seamlessly integrated with fixed-priority-based scheduling algorithms. These techniques leverage redundancies to enhance schedulability in fault-tolerant and real-time distributed systems. Our fault-tolerant techniques make use of the primary-backup scheme to tolerate permanent hardware failures. The first technique (referred to as Tercos) terminates the execution of active-backup copies when the corresponding primary copies are successfully completed. Tercos is designed to reduce schedule lengths in fault-free scenarios, enhancing schedulability by executing portions of active-backup copies in passive forms. The second technique (referred to as Debus) uses a deferred-active-backup scheme to further minimize schedule lengths and improve schedulability. Debus schedules active-backup copies as late as possible, while terminating active-backup copies when their primary copies are completed. Experimental results show that, compared with existing algorithms in the literature, Tercos can significantly improve schedulability by up to 17.0% (with an average of 9.7%). Furthermore, empirical results reveal that Debus can enhance schedulability over Tercos by up to 12% (with an average of 7.8%).
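The two ideas named in the abstract (terminating an active-backup copy once its primary succeeds, and deferring the backup's start as late as possible) can be illustrated with a small calculation. The following is only a sketch under a deliberately simplified model that is not taken from the paper: one task with execution time `c` and relative deadline `d`, a primary copy running uninterrupted over [0, c) on one processor, and a backup copy running uninterrupted from `start` on another. The function names and the model are hypothetical and are not the authors' Tercos or Debus algorithms.

```python
def latest_backup_start(c, d):
    """Latest start time at which a backup of length c still meets deadline d.

    Assumption of this sketch: the backup runs without preemption.
    """
    if c > d:
        raise ValueError("execution time exceeds the deadline")
    return d - c


def backup_work_fault_free(c, d, start, terminate_on_primary_success=True):
    """Processor time the backup consumes when the primary succeeds at time c.

    Assumption of this sketch: the primary occupies [0, c) on another
    processor and the backup would occupy [start, start + c).
    """
    if start < 0 or start + c > d:
        raise ValueError("backup cannot finish by the deadline")
    if not terminate_on_primary_success:
        # The backup always runs to completion, even if it is not needed.
        return c
    # With termination, only the part that overlaps the primary is executed.
    return max(0, c - start)


if __name__ == "__main__":
    c, d = 6, 10
    # Backup started early and never terminated: full redundant execution.
    print(backup_work_fault_free(c, d, start=2, terminate_on_primary_success=False))
    # Same start, but terminated when the primary completes.
    print(backup_work_fault_free(c, d, start=2))
    # Start deferred to the latest feasible time, then terminated.
    print(backup_work_fault_free(c, d, start=latest_backup_start(c, d)))
```

In this toy model, the deferred start leaves only `max(0, 2c - d)` units of redundant backup execution in the fault-free case, which is zero whenever the backup can be fully passive (`2c <= d`). The schedulability gains reported in the abstract come from the authors' own experiments, not from this model.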
Pages: 626-639
Page count: 14
Related papers
50 in total
  • [1] TERCOS: A novel technique for exploiting redundancies in fault-tolerant and real-time distributed systems
    Luo, Wei
    Yang, FuMin
    Tu, Gang
    Pang, LiPing
    Qin, Xiao
    13TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED AND REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2007, : 275 - +
  • [2] A feasible schedulability analysis for fault-tolerant hard real-time systems
    Jun, L
    Yang, FM
    Lu, YS
    ICECCS 2005: 10TH IEEE INTERNATIONAL CONFERENCE ON ENGINEERING OF COMPLEX COMPUTER SYSTEMS, PROCEEDINGS, 2005, : 176 - 183
  • [3] An effective schedulability analysis for fault-tolerant hard real-time systems
    Lima, GMD
    Burns, A
    13TH EUROMICRO CONFERENCE ON REAL-TIME SYSTEMS, PROCEEDINGS, 2001, : 209 - 216
  • [4] Fault-tolerant scheduling in distributed real-time systems
    Satyanarayana, NV
    Mall, R
    Pal, A
    2001 INTERNATIONAL CONFERENCE ON COMPUTER NETWORKS AND MOBILE COMPUTING, PROCEEDINGS, 2001, : 275 - 280
  • [5] Fault-tolerant scheduling in distributed real-time systems
    Thai, ND
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, 2004, 3019 : 125 - 130
  • [6] Holistic schedulability analysis of a fault-tolerant real-time distributed run-time support
    Chevochot, P
    Puaut, I
    SEVENTH INTERNATIONAL CONFERENCE ON REAL-TIME COMPUTING SYSTEMS AND APPLICATIONS, PROCEEDINGS, 2000, : 355 - 362
  • [7] Fault-tolerant real-time communication in distributed computing systems
    Zheng, Q
    Shin, KG
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 1998, 9 (05) : 470 - 480
  • [8] Distributed fault-tolerant avionic systems - A real-time perspective
    Audsley, NC
    Burke, M
    1998 IEEE AEROSPACE CONFERENCE PROCEEDINGS, VOL 4, 1998, : 43 - 60
  • [9] Real-time fault-tolerant scheduling in heterogeneous distributed systems
    Qin, X
    Han, ZF
    Pang, LP
    Li, SL
    Jin, H
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000, : 421 - 427