A non-blocking Checkpointing algorithm for distributed systems

Authors
Guoliang L. [1]
Shuyu C. [1]
Xiaoqin Z. [1]
Affiliation
[1] College of Computer Science, Chongqing University, Chongqing
Keywords
Coordinated checkpointing; Distributed systems; Failure and recovery; Fault tolerance
DOI
10.4156/jdcta.vol5.issue7.29
Abstract
Checkpointing and rollback recovery is an effective fault-tolerance technique that has been widely used in parallel and distributed computer systems. We present a nonblocking coordinated checkpointing algorithm for distributed systems that differs from the conventional approach, in which processes first take temporary checkpoints and then convert them to permanent ones. The proposed algorithm allows processes to take permanent checkpoints directly, without taking temporary checkpoints, which contributes to its execution speed. Orphan messages are eliminated by the sender processes, and in-transit messages are eliminated by the checkpointing interval and a retransmission mechanism. The algorithm reduces the control-message complexity of taking checkpoints from O(n²) to O(n), requiring n-1 control messages.
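The abstract does not spell out the protocol, so the following is only a toy illustration of the message counts it quotes, not the authors' algorithm. It assumes a single initiator (process 0, a hypothetical choice) that sends one checkpoint request to each of the other n-1 processes, and compares that with a hypothetical all-to-all exchange in which every process notifies every other one. All function names are invented for this sketch.

```python
def all_to_all_messages(n: int) -> int:
    """Control messages if every process notifies every other process: O(n^2)."""
    return n * (n - 1)


def single_initiator_messages(n: int) -> int:
    """Control messages if one initiator sends one request per peer: O(n)."""
    return n - 1


def take_checkpoints(n: int):
    """Toy run: process 0 initiates and each process records a permanent
    checkpoint directly, with no temporary phase (assumed behaviour)."""
    checkpoints = {0: "permanent"}  # the initiator checkpoints itself
    sent = 0
    for pid in range(1, n):
        sent += 1                       # one checkpoint request to this peer
        checkpoints[pid] = "permanent"  # peer checkpoints on receipt
    return checkpoints, sent


if __name__ == "__main__":
    for n in (4, 16, 64):
        _, sent = take_checkpoints(n)
        print(f"n={n}: all-to-all={all_to_all_messages(n)}, single initiator={sent}")
```

The sketch ignores orphan and in-transit messages, failures, and retransmission; it only shows why a single round of requests from one initiator costs n-1 control messages.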
Pages: 230-238
Page count: 8