A Fault-tolerance Framework for Distributed Component Systems

Cited by: 1

Authors:

Hamid, Brahim [1]

Radermacher, Ansgar [1]

Vanuxeem, Patrick [1]

Lanusse, Agnes [1]

Gerard, Sebastien [1]

Affiliation:

[1] CEA, LIST, Lab Ingn Dirigee Modeles Syst Embarques, F-91191 Gif Sur Yvette, France

Source:

PROCEEDINGS OF THE 34TH EUROMICRO CONFERENCE ON SOFTWARE ENGINEERING AND ADVANCED APPLICATIONS | 2008

Keywords:

Connector CORBA Component Model; Distributed applications; Failure detection; Fault tolerance; Middleware; Model-driven;

DOI:

10.1109/SEAA.2008.50

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The demand for higher system reliability and availability keeps increasing, even in domains not traditionally concerned with such issues. Solutions are expected to be efficient, flexible, reusable on rapidly evolving hardware and, of course, low-cost. Combining model-driven and component-based approaches seems a very promising way to build such solutions. Hence, in this paper we present an approach that treats the model as its first-class structural element throughout the development process. Our proposal is illustrated with an application modeled in UML (extended with some of its dedicated profiles). Our approach includes an underlying execution infrastructure/middleware that provides fault-tolerance services. On the component side, our framework promotes an infrastructure based on the Component/Container/Connector paradigm that provides run-time facilities enabling transparent management of fault tolerance (mainly fault-detection and redundancy mechanisms). On the model-driven side, our framework provides tool support to help users model their applications and deploy and configure them on computing platforms. In this paper we focus on the run-time support offered by the component framework, especially the replication-aware interaction mechanism that enables transparent replication management, and some additional system components dedicated to fault detection and replica management.
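To make the idea of a replication-aware connector combined with a fault-detection component more concrete, here is a minimal sketch. It is not the framework's actual API: the class names (`HeartbeatDetector`, `ReplicationAwareConnector`), the heartbeat timeout, and the priority-ordered failover policy are all assumptions made for illustration, since the abstract does not specify the replication style or the detection protocol.

```python
import time
from typing import Callable, Dict, Optional


class HeartbeatDetector:
    """Hypothetical failure detector: a replica is suspected when no
    heartbeat has been recorded for it within `timeout` seconds."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        self.clock = clock
        self.last_seen: Dict[str, float] = {}

    def heartbeat(self, replica_id: str) -> None:
        # Record the time at which the replica last reported in.
        self.last_seen[replica_id] = self.clock()

    def is_suspected(self, replica_id: str) -> bool:
        seen: Optional[float] = self.last_seen.get(replica_id)
        return seen is None or self.clock() - seen > self.timeout


class ReplicaUnavailable(Exception):
    """Raised when no replica can serve the request."""


class ReplicationAwareConnector:
    """Hypothetical connector: the caller invokes the connector, and the
    connector picks a live replica, so replication stays transparent."""

    def __init__(self, replicas: Dict[str, Callable[[str], str]],
                 detector: HeartbeatDetector) -> None:
        self.replicas = replicas  # insertion order is the failover priority
        self.detector = detector

    def invoke(self, request: str) -> str:
        for replica_id, handler in self.replicas.items():
            if self.detector.is_suspected(replica_id):
                continue  # skip replicas the detector considers failed
            try:
                return handler(request)
            except ConnectionError:
                continue  # the call itself failed: try the next replica
        raise ReplicaUnavailable("no live replica for request")
```

In this sketch the client code only ever calls `invoke`; which replica answers is decided by the connector together with the detector, which is one simple way to read "transparent replication management". A real container would also have to handle state transfer between replicas, which is deliberately left out here.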

Pages: 84-91

Page count: 8
