Managing Dynamic Reconfiguration for Fault-tolerance on a Manycore Architecture

Cited by: 1
Authors
Zain-ul-Abdin [1]
Gebrewahid, Essayas [1]
Svensson, Bertil [1]
Affiliations
[1] Halmstad Univ, Ctr Res Embedded Syst, Halmstad, Sweden
DOI
10.1109/IPDPSW.2012.38
Chinese Library Classification (CLC) number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
With the advent of manycore architectures comprising hundreds of processing elements, fault management has become a major challenge. We present an approach that uses the occam-pi language to manage the fault recovery mechanism on a new manycore architecture, the Platform 2012 (P2012). The approach is made possible by extending our previously developed compiler framework to compile occam-pi implementations to the P2012 architecture. We describe the techniques used to translate the salient features of the occam-pi language to the native programming model of the P2012 architecture. We demonstrate the applicability of the approach by an experimental case study, in which the DCT algorithm is implemented on a set of four processing elements. During runtime, some of the tasks are then relocated from assumed faulty processing elements to the faultless ones by means of dynamic reconfiguration of the hardware. The working of the demonstrator and the simulation results illustrate not only the feasibility of the approach but also how the use of higher-level abstractions simplifies the fault handling.
Pages: 312 - 319
Page count: 8
Related papers
50 in total
  • [1] Towards fault-tolerance of IMA with safe dynamic reconfiguration
    Schubert, Tim
    Friedrich, Sven
    Zaeske, Wanja
    Durak, Umut
    CEAS Aeronautical Journal, 2024, 15 (04) : 1223 - 1234
  • [2] Fault-tolerance of computation systems with functional reconfiguration
    Bogatyrev, V.A.
    Pribory i Sistemy Upravleniya, 2001, (11) : 51 - 54
  • [3] Improving fault-tolerance in intelligent video surveillance by monitoring, diagnosis and dynamic reconfiguration
    Doblander, A
    Maier, A
    Rinner, B
    Schwabach, H
    PROCEEDINGS OF THE THIRD INTERNATIONAL WORKSHOP ON INTELLIGENT SOLUTIONS IN EMBEDDED SYSTEMS, 2005, : 194 - 201
  • [4] High speed dynamic fault-tolerance
    Sengupta, J
    Bansal, PK
    IEEE REGION 10 INTERNATIONAL CONFERENCE ON ELECTRICAL AND ELECTRONIC TECHNOLOGY, VOLS 1 AND 2, 2001, : 669 - 675
  • [5] Fault-Tolerance Mechanism for Self-Reconfiguration of Modular Robots
    Bassil, Jad
    Tannoury, Perla
    Piranda, Benoit
    Makhoul, Abdallah
    Bourgeois, Julien
    2022 INTERNATIONAL WIRELESS COMMUNICATIONS AND MOBILE COMPUTING, IWCMC, 2022, : 360 - 365
  • [6] A framework for reconfiguration-based fault-tolerance in distributed systems
    Porcarelli, S
    Castaldi, M
    Di Giandomenico, F
    Bondavalli, A
    Inverardi, P
    ARCHITECTING DEPENDABLE SYSTEMS II, 2004, 3069 : 167 - 190
  • [7] FAULT-TOLERANCE IN PYRAMID TREE NETWORK ARCHITECTURE
    MOHSIN, M
    GUPTA, B
    COMPUTER SYSTEMS SCIENCE AND ENGINEERING, 1995, 10 (03) : 164 - 172
  • [8] Dynamic scheduling and fault-tolerance: Specification and verification
    Janowski, T
    Joseph, M
    REAL-TIME SYSTEMS, 2001, 20 (01) : 51 - 81
  • [9] A DYNAMIC FAULT-TOLERANCE FRAMEWORK FOR REMOTE ROBOTS
    VISINSKY, ML
    CAVALLARO, JR
    WALKER, ID
    IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, 1995, 11 (04) : 477 - 490
  • [10] Dynamic Scheduling and Fault-Tolerance: Specification and Verification
    Tomasz Janowski
    Mathai Joseph
    Real-Time Systems, 2001, 20 : 51 - 81