Reliable Distributed Real-time and Embedded Systems Through Safe Middleware Adaptation

Cited by: 0
Authors
Dabholkar, Akshay [1]
Dubey, Abhishek [1]
Gokhale, Aniruddha [1]
Karsai, Gabor [1]
Mahadevan, Nagabhushan [1]
Affiliations
[1] Vanderbilt Univ, Dept EECS, Inst Software Integrated Syst, Nashville, TN 37235 USA
Keywords
Middleware; Adaptation; Fault Tolerance; Real-time; Software Health Management; Profiling
DOI
10.1109/SRDS.2012.59
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Distributed real-time and embedded (DRE) systems are a class of real-time systems formed through a composition of predominantly legacy, closed, and statically scheduled real-time subsystems, which comprise over-provisioned resources to deal with worst-case failure scenarios. The formation of the system-of-systems leads to a new range of faults that manifest at different granularities and for which no statically defined fault tolerance scheme applies. Thus, dynamic and adaptive fault tolerance mechanisms are needed, and they must execute within the available resources without compromising the safety and timeliness of existing real-time tasks in the individual subsystems. To address these requirements, this paper describes a middleware solution called Safe Middleware Adaptation for Real-Time Fault Tolerance (SafeMAT), which opportunistically leverages the available slack in the over-provisioned resources of individual subsystems. SafeMAT comprises three primary artifacts: (1) a flexible and configurable distributed runtime resource monitoring framework that can pinpoint in real time the available slack in the system, which is used in making dynamic and adaptive fault tolerance decisions; (2) a safe and resource-aware dynamic failure adaptation algorithm that enables efficient recovery from different granularities of failures within the available slack in the execution schedule while ensuring that real-time constraints are not violated and resources are not overloaded; and (3) a framework that empirically validates the correctness of the dynamic mechanisms and the safety of the DRE system. Experimental results evaluating SafeMAT on an avionics application indicate that SafeMAT incurs only 9-15% runtime failover overhead and 2-6% processor utilization overhead, thereby providing safe and predictable failure adaptability in real time.
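The idea of recovering only within available slack can be illustrated with a minimal sketch. This is not the SafeMAT algorithm, which the abstract does not specify; it assumes a simplified model in which each node reports a single CPU utilization figure and a recovery task is admitted only if it fits under a safe utilization ceiling. The names `Node`, `cap`, and `pick_failover_node` are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass
class Node:
    """Hypothetical view of one subsystem as seen by a resource monitor."""
    name: str
    utilization: float  # fraction of CPU committed to existing tasks (0..1)
    cap: float = 1.0    # assumed safe utilization ceiling for this node

    @property
    def slack(self) -> float:
        # Spare capacity that recovery work may use without exceeding the cap.
        return max(0.0, self.cap - self.utilization)


def pick_failover_node(nodes: Iterable[Node], demand: float,
                       failed: Set[str]) -> Optional[Node]:
    """Return the healthy node with the most slack that can absorb `demand`.

    Returns None when no node has enough slack, so the caller can fall back
    to a coarser recovery action instead of overloading a node.
    """
    candidates = [n for n in nodes
                  if n.name not in failed and n.slack >= demand]
    return max(candidates, key=lambda n: n.slack, default=None)


if __name__ == "__main__":
    cluster = [Node("A", 0.70, 0.9), Node("B", 0.50, 0.9), Node("C", 0.85, 0.9)]
    target = pick_failover_node(cluster, demand=0.15, failed={"B"})
    print(target.name if target else "no safe target")
```

Picking the node with the most slack is only one possible policy; a real deployment would also need to check deadlines against the execution schedule rather than aggregate utilization alone.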
Pages: 362-371
Number of pages: 10
Related papers
50 in total
  • [41] Adaptive resource management middleware in distributed real-time systems
    School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
    Unknown
    [J]. Dianzi Keji Daxue Xuebao, 2008, 1 (101-104):
  • [42] Middleware for real-time distributed simulations
    McLean, T
    Fujimoto, R
    Fitzgibbons, B
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2004, 16 (15) : 1483 - 1501
  • [43] iLAND: An Enhanced Middleware for Real-Time Reconfiguration of Service Oriented Distributed Real-Time Systems
    Garcia Valls, Marisol
    Rodriguez Lopez, Iago
    Fernandez Villar, Laura
    [J]. IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2013, 9 (01) : 228 - 236
  • [44] Modularizing variability and scalability concerns in distributed real-time and embedded systems with modeling tools and component middleware
    Deng, Gan
    Schmidt, Douglas C.
    Gokhale, Aniruddha
    Nechypurenko, Andrey
    [J]. NINTH IEEE INTERNATIONAL SYMPOSIUM ON OBJECT AND COMPONENT-ORIENTED REAL-TIME DISTRIBUTED COMPUTING, PROCEEDINGS, 2006, : 327 - 334
  • [45] Applying AOP and MDA to Middleware-based Distributed Real-time Embedded Systems Software Process
    Liu Jingyong
    Zhong Yong
    Zhang Lichen
    Chen Yong
    [J]. 2009 ASIA-PACIFIC CONFERENCE ON INFORMATION PROCESSING (APCIP 2009), VOL 1, PROCEEDINGS, 2009, : 270 - +
  • [46] Optimizing General-Purpose Software Instrumentation Middleware Performance for Distributed Real-time and Embedded Systems
    Feiock, Dennis C.
    Hill, James H.
    [J]. 2013 IEEE 16TH INTERNATIONAL SYMPOSIUM ON OBJECT/COMPONENT/SERVICE-ORIENTED REAL-TIME DISTRIBUTED COMPUTING (ISORC), 2013,
  • [47] TOWARDS PREDICTABLE AND RELIABLE DISTRIBUTED REAL-TIME SYSTEMS
    TOKUDA, H
    [J]. PROCEEDINGS : THE THIRTEENTH ANNUAL INTERNATIONAL COMPUTER SOFTWARE & APPLICATIONS CONFERENCE, 1989, : 437 - 438
  • [48] Implementing reliable distributed real-time systems with the Θ-model
    Hermant, Jean-Francois
    Widder, Josef
    [J]. PRINCIPLES OF DISTRIBUTED SYSTEMS, 2006, 3974 : 334 - +
  • [49] Engineering safe, real-time distributed control systems
    Croll, P
    Rudram, C
    Chambers, C
    Uchihira, N
    [J]. 24TH EUROMICRO CONFERENCE - PROCEEDING, VOLS 1 AND 2, 1998, : 445 - 452
  • [50] Distributed priority inheritance for real-time and embedded systems
    Sanchez, Cesar
    Sipma, Henny B.
    Gill, Christopher D.
    Manna, Zohar
    [J]. PRINCIPLES OF DISTRIBUTED SYSTEMS, PROCEEDINGS, 2006, 4305 : 110 - 125