Transparent parallel checkpointing and migration in clusters and ClusterGrids

Cited by: 2
Authors
Kovacs, Jozsef [1]
Affiliations
[1] MTA SZTAKI, Parallel & Distributed Syst Lab, POB 63, H-1518 Budapest, Hungary
Keywords
MP; message passing; parallel; checkpoint; migration; cluster; grid; clustergrid; pvm; Condor; graphical programming environment;
DOI
10.1504/IJCSE.2009.027379
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
This paper introduces a novel approach in parallel checkpointing aimed at supporting fault-tolerance and migration among clusters of a ClusterGrid environment with various middleware components. Based on an architectural analysis, compatibility and integrity requirements are identified and corresponding conditions are established. Some of the available checkpointing systems are checked against the conditions in order to examine their conformity. Finally, a novel checkpointing approach is defined and the Parallel Grid Runtime and Application Development Environment (P-GRADE) Grid Programming Tool is adapted.
Pages: 171 - 181
Page count: 11
Related papers
50 in total
  • [1] Application and middleware transparent checkpointing with TCKPT on ClusterGrids
    Kovacs, Jozsef
    Kacsuk, Peter
    Januszewski, Radoslaw
    Jankowski, Gracjan
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2010, 26 (03) : 498 - 503
  • [2] CHPOX: Transparent checkpointing system for Linux clusters
    Sudakov, Oleksandr O.
    Meshcheriakov, Ievgenii S.
    Boyko, Yuriy V.
    IDAACS 2007: PROCEEDINGS OF THE 4TH IEEE WORKSHOP ON INTELLIGENT DATA ACQUISITION AND ADVANCED COMPUTING SYSTEMS: TECHNOLOGY AND APPLICATIONS, 2007, : 159 - +
  • [3] Improving Preemptive Scheduling with Application-Transparent Checkpointing in Shared Clusters
    Li, Jack
    Pu, Calton
    Chen, Yuan
    Talwar, Vanish
    Milojicic, Dejan
    PROCEEDINGS OF THE 16TH ANNUAL MIDDLEWARE CONFERENCE, 2015, : 222 - 234
  • [4] Efficient user-level thread migration and checkpointing on windows NT clusters
    Abdel-Shafi, H
    Speight, E
    Bennete, JK
    PROCEEDINGS OF THE 3RD USENIX WINDOWS NT SYMPOSIUM, 1999, : 1 - 10
  • [5] Application and middleware transparent checkpointing with tckpt on clustergrid: A novel checkpointing approach
    Kovacs, Jozsef
    Mikolajczak, Rafal
    Januszewski, Radoslaw
    Jankowski, Gracjan
    DISTRIBUTED AND PARALLEL SYSTEMS: FROM CLUSTER TO GRID COMPUTING, 2007, : 179 - +
  • [6] Parallel Data Migration Framework on Linux Clusters
    Mudawar, Muhamed F.
    AlGhuson, Mohammed K.
    ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2011, 36 (05) : 785 - 794
  • [7] Parallel Data Migration Framework on Linux Clusters
    Muhamed F. Mudawar
    Mohammed K. AlGhuson
    Arabian Journal for Science and Engineering, 2011, 36 : 785 - 794
  • [8] Making parallel processing on clusters efficient, transparent and easy for programmers
    Goscinski, AM
    FIRST IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER COMPUTING AND THE GRID, PROCEEDINGS, 2001, : 8 - 9
  • [9] Consistent checkpointing for high performance clusters
    Nishioka, T
    Hori, A
    Ishikawa, Y
    CLUSTER 2000: IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING, PROCEEDINGS, 2000, : 367 - 368
  • [10] System-Level Transparent Checkpointing for OpenSHMEM
    Garg, Rohan
    Vienne, Jerome
    Cooperman, Gene
    OPENSHMEM AND RELATED TECHNOLOGIES: ENHANCING OPENSHMEM FOR HYBRID ENVIRONMENTS, 2016, 10007