An efficient computing-checkpoint based coordinated checkpoint algorithm

Cited by: 0
Authors
Men Chaoguang [1]
Wang Dongsheng
Zhao Yunlong
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Beijing 100084, Peoples R China
[2] Harbin Engn Univ, Res Ctr High Dependabil Comp Technol, Harbin 150001, Heilongjiang, Peoples R China
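Keywords
DOI
Not available
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract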
In this paper, the concept of a "computing checkpoint" is introduced, and an efficient coordinated checkpoint algorithm is proposed. The algorithm combines two approaches to reducing the overhead of coordinated checkpointing: minimizing the number of processes that take checkpoints, and making the checkpointing process non-blocking. The broadcast commit message piggybacks information about which processes have taken a new checkpoint, so the checkpoint sequence number of every process is kept consistent across all processes, and unnecessary checkpoints and orphan messages are avoided in subsequent execution. Evaluation results show that the number of redundant computing checkpoints is less than 1/10 of the number of tentative checkpoints. Analyses and experiments show that the overhead of the algorithm is lower than that of other coordinated checkpoint algorithms.
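The piggybacking step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `CommitMessage`, `Process`, `on_commit`, and `broadcast_commit`, the message layout, and the per-process vector of checkpoint sequence numbers are all assumptions made for illustration. It only shows how a commit message that carries the set of newly checkpointed processes could let every process keep the same view of the sequence numbers.

```python
from dataclasses import dataclass, field


@dataclass
class CommitMessage:
    """Hypothetical commit message layout (an assumption, not from the paper)."""
    initiator: int
    checkpointed: frozenset  # piggybacked ids of processes that took a new checkpoint


@dataclass
class Process:
    pid: int
    n: int  # total number of processes in the system
    csn: list = field(default_factory=list)  # local view of every process's sequence number

    def __post_init__(self):
        if not self.csn:
            self.csn = [0] * self.n

    def on_commit(self, msg):
        # Advance the sequence number of each process named in the commit message.
        for p in msg.checkpointed:
            self.csn[p] += 1


def broadcast_commit(processes, initiator, checkpointed):
    """Deliver one commit message to every process (reliable broadcast is assumed)."""
    msg = CommitMessage(initiator, frozenset(checkpointed))
    for proc in processes:
        proc.on_commit(msg)
    return msg


# Two checkpointing rounds among four processes; only some processes checkpoint.
processes = [Process(pid=i, n=4) for i in range(4)]
broadcast_commit(processes, initiator=0, checkpointed={0, 2})
broadcast_commit(processes, initiator=2, checkpointed={1, 2})
views = [p.csn for p in processes]
```

After both rounds every entry of `views` is identical, including the views of processes that took no checkpoint themselves. Under these assumptions, a process can compare a sequence number attached to an incoming message against its local view without any extra coordination round.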
Pages: 99-109
Page count: 11