Improving the performance of checkpointing scheme with task duplication

Authors
Li, Kaiyuan [1]
Yang, Xiaozong [1]
Affiliation
[1] Harbin Inst of Technology, Harbin, China
Source
Keywords
Computer system recovery
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
Checkpointing is a common technique for reducing the execution time of programs under the fault assumption. Combining checkpointing with task duplication achieves not only effective fault recovery but also perfect fault detection. The overhead of such systems comes from two sources: the compare and save operations at each checkpoint, and the rollbacks caused by faults. This paper improves the method presented by Ziv and Bruck by employing incremental checkpointing. The improved method reduces the overhead of the compare and save operations and, moreover, avoids the rollbacks caused by latent faults. Analysis shows that our method performs better than that of Ziv and Bruck.
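The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm or that of Ziv and Bruck: it only assumes that a task runs as two replicas, that the replicas are compared at every checkpoint, that a mismatch triggers a rollback to the last checkpoint, and that only the state items changed since that checkpoint are compared and saved. The names `run_duplicated`, `changed_items`, the `faults` injection set, and the `__fault__` marker are all hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

State = Dict[str, int]
Step = Callable[[State], None]


def changed_items(old: State, new: State) -> State:
    """Return only the items of `new` that differ from `old` (the increment)."""
    return {k: v for k, v in new.items() if old.get(k) != v}


def run_duplicated(steps: List[Step], initial: State,
                   faults: Set[Tuple[int, int]]) -> Tuple[State, int, int]:
    """Run `steps` on two replicas with a checkpoint after every step.

    `faults` is a hypothetical injection set of (step_index, attempt) pairs;
    on those attempts replica B is corrupted to simulate a transient fault.
    Returns (final state, number of rollbacks, number of items saved).
    """
    checkpoint = dict(initial)
    rollbacks = 0
    saved_items = 0
    for i, step in enumerate(steps):
        attempt = 0
        while True:
            # Both replicas restart from the last agreed checkpoint.
            a, b = dict(checkpoint), dict(checkpoint)
            step(a)
            step(b)
            if (i, attempt) in faults:
                b["__fault__"] = 1  # simulated corruption of replica B
            # Compare and save only what changed since the checkpoint.
            delta_a = changed_items(checkpoint, a)
            delta_b = changed_items(checkpoint, b)
            if delta_a == delta_b:
                checkpoint.update(delta_a)
                saved_items += len(delta_a)
                break
            rollbacks += 1  # mismatch detected: discard and re-execute
            attempt += 1
    return checkpoint, rollbacks, saved_items


def inc_x(s: State) -> None:
    s["x"] += 1


def double_into_y(s: State) -> None:
    s["y"] = s["x"] * 2


final, rollbacks, saved = run_duplicated(
    [inc_x, double_into_y, inc_x],
    {"x": 0, "y": 0, "z": 7},
    faults={(1, 0)},
)
print(final, rollbacks, saved)
```

In this toy run the item `z` never changes, so it is never compared or saved again after the initial state, which is the kind of saving an incremental scheme aims for. The sketch does not model latent faults or the timing analysis the abstract refers to.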
Pages: 33-35