Performance optimization of checkpointing schemes with task duplication

Cited by: 0
Authors
Li, Zhongwen [1 ,3 ]
Xiang, Yang [2 ]
Chen, Hong [1 ]
Affiliations
[1] Xiamen Univ, Informat Sci & Technol Coll, Xiamen 361005, Peoples R China
[2] Deakin Univ, Sch Engn & Informat Technol, Geelong, Vic, Australia
[3] Zhongshan Inst UESTC, Zhongshan 528402, Peoples R China
Keywords
fault-tolerant computing; checkpointing intervals; task duplication; performance optimization
DOI
10.1109/IMSCCS.2006.250
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using store-checkpoints (SCPs) and compare-checkpoints (CCPs), we present an adaptive checkpointing scheme that dynamically adjusts the checkpointing interval online. With additional SCPs and CCPs, both the comparison and storage operations can be used efficiently, improving the performance of checkpointing schemes. We further derive methods to calculate the optimal numbers of checkpoints that minimize the mean execution time. Simulation results show that, compared with previous methods, the proposed approach significantly increases the likelihood of timely task completion in the presence of faults.
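The idea of choosing a checkpoint count that minimizes mean execution time can be illustrated with a deliberately simplified model. The sketch below is not the adaptive scheme of the paper: it assumes a task of length `task_len` run on two processors, independent Poisson faults at rate `fault_rate` per processor, and `n` equally spaced checkpoints that each both compare and store state (so SCPs and CCPs are not modeled separately). A segment whose comparison fails is re-executed from the previous checkpoint. All function and parameter names are hypothetical.

```python
import math


def expected_time(task_len, n, fault_rate, overhead):
    """Mean completion time with n combined compare-and-store checkpoints.

    Assumed model: a segment of length task_len / n succeeds only if both
    copies run fault-free, which happens with probability exp(-2 * rate * seg).
    Each attempt costs the segment length plus the checkpoint overhead, and
    the number of attempts per segment is geometric.
    """
    seg = task_len / n
    p_ok = math.exp(-2.0 * fault_rate * seg)
    return n * (seg + overhead) / p_ok


def best_checkpoint_count(task_len, fault_rate, overhead, max_n=200):
    """Brute-force search for the checkpoint count with the smallest mean time."""
    return min(
        range(1, max_n + 1),
        key=lambda n: expected_time(task_len, n, fault_rate, overhead),
    )


if __name__ == "__main__":
    n_opt = best_checkpoint_count(task_len=100.0, fault_rate=0.01, overhead=1.0)
    print(n_opt, expected_time(100.0, n_opt, 0.01, 1.0))
```

In this toy model, more checkpoints shorten the work lost per fault but add overhead, so the mean time has an interior minimum whenever the fault rate is positive. Separating cheap comparisons from more expensive stores, as the abstract describes, would add a second parameter to the search.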
Pages: 671 / +
Number of pages: 2
Related papers
50 in total
  • [1] Performance optimization of checkpointing schemes with task duplication
    Ziv, A
    Bruck, J
    IEEE TRANSACTIONS ON COMPUTERS, 1997, 46 (12) : 1381 - 1386
  • [2] Analysis of checkpointing schemes with task duplication
    Ziv, A
    Bruck, J
    IEEE TRANSACTIONS ON COMPUTERS, 1998, 47 (02) : 222 - 227
  • [3] Improving the performance of checkpointing scheme with task duplication
    Li, Kaiyuan
    Yang, Xiaozong
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2000, 28 (05): : 33 - 35
  • [4] Optimal checkpointing interval for task duplication with spare processing
    Nakagawa, S
    Okuda, Y
    Yamada, S
    NINTH ISSAT INTERNATIONAL CONFERENCE ON RELIABILITY AND QUALITY IN DESIGN, 2003 PROCEEDINGS, 2003, : 215 - 219
  • [5] High Performance Computing Systems with Various Checkpointing Schemes
    Naksinehaboon, N.
    Paun, M.
    Nassar, R.
    Leangsuksun, B.
    Scott, S.
    INTERNATIONAL JOURNAL OF COMPUTERS COMMUNICATIONS & CONTROL, 2009, 4 (04) : 386 - 400
  • [6] Augmenting work-greedy assignment schemes with task duplication
    Manoharan, S
    1997 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 1997, : 772 - 779
  • [7] Performance analysis of different checkpointing and recovery schemes using stochastic model
    Mandal, PS
    Mukhopadhyaya, K
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2006, 66 (01) : 99 - 107
  • [8] A new approach for high performance computing systems with various checkpointing schemes
    Gyung-Leen Park
    Hee Yong Youn
    Youn, Hee Yong (youn@ece.skku.ac.kr), 2005, Springer (33): : 1 - 2
  • [9] A new approach for high performance computing systems with various checkpointing schemes
    Park, GL
    Youn, HY
    JOURNAL OF SUPERCOMPUTING, 2005, 33 (1-2): : 65 - 78
  • [10] The performance of checkpointing and replication schemes for fault tolerant mobile agent systems
    Park, TS
    Byun, IS
    Kim, HJ
    Yeom, HY
    21ST IEEE SYMPOSIUM ON RELIABLE DISTRIBUTED SYSTEMS, PROCEEDINGS, 2002, : 256 - 261