Implementation of Watch Dog Timer for Fault Tolerant Computing on Cluster Server

Cited by: 0
Authors
Bheevgade, Meenakshi [1]
Patrikar, Rajendra M. [1]
Affiliations
[1] Visvesvaraya Natl Inst Technol, Nagpur 440010, Maharashtra, India
Keywords
Cluster; Fault tolerant; Grid; Grid Computing System; Meta-computing;
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
In the current technology era, clusters have become a necessity for modern computing and data applications, since many applications need long computation times (days or even months). Although parallelization speeds up computation, the time required by many applications can still be long. The reliability of the cluster therefore becomes a very important issue, and implementing a fault-tolerance mechanism becomes essential. The difficulty of designing a fault-tolerant cluster system increases with the severity of the failures it must handle. Most importantly, an algorithm that handles a simple failure in a system must also tolerate more severe failures. In this paper, we implement the watchdog timer concept in a parallel environment to handle failures. The simple algorithm implemented in our project handles different types of failures; as a result, we found that the reliability of the cluster improves.
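As an illustration of the general watchdog timer idea mentioned in the abstract, the following is a minimal sketch in Python. It is not the authors' implementation, which the abstract does not describe. It assumes that each cluster node periodically "kicks" a software watchdog and that a node counts as failed once no kick has arrived within a timeout. The names `Watchdog`, `kick`, `expired`, `FakeClock`, and the node labels are all hypothetical.

```python
import time
from typing import Callable, Dict, List


class Watchdog:
    """Hypothetical software watchdog: records the last heartbeat per node."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout              # seconds allowed between heartbeats
        self.clock = clock                  # injectable clock, eases testing
        self.last_seen: Dict[str, float] = {}

    def kick(self, node: str) -> None:
        """Record a heartbeat from a node, resetting its timer."""
        self.last_seen[node] = self.clock()

    def expired(self) -> List[str]:
        """Return, sorted by name, the nodes whose heartbeat is overdue."""
        now = self.clock()
        return sorted(node for node, seen in self.last_seen.items()
                      if now - seen > self.timeout)


class FakeClock:
    """Manually advanced clock so the example runs without real waiting."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


clock = FakeClock()
watchdog = Watchdog(timeout=5.0, clock=clock)

watchdog.kick("node-a")        # both nodes report in at t = 0
watchdog.kick("node-b")
clock.advance(4.0)
watchdog.kick("node-a")        # only node-a reports again at t = 4
clock.advance(2.0)             # t = 6: node-b has been silent for 6 s

failed = watchdog.expired()
print(failed)
```

In a real cluster the heartbeats would arrive over the network and the recovery action (restart, task migration, and so on) would follow detection. Both are outside the scope of this sketch.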
Pages: 265-268
Page count: 4
Related papers
50 in total
  • [1] Fault tolerant cluster computing through replication
    Shum, KH
    1997 INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS, PROCEEDINGS, 1997, : 756 - 761
  • [2] A Fault Tolerant Approach in Cluster Computing System
    Shwe, Thanda
    Aye, Win
    ECTI-CON 2008: PROCEEDINGS OF THE 2008 5TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING/ELECTRONICS, COMPUTER, TELECOMMUNICATIONS AND INFORMATION TECHNOLOGY, VOLS 1 AND 2, 2008, : 149 - +
  • [3] Design and Implementation of Autonomic Computing System for Server Cluster
    Liu, Wenjie
    Zhou, Yuntao
    DCABES 2008 PROCEEDINGS, VOLS I AND II, 2008, : 35 - 40
  • [4] A Cluster-Based Implementation of a Fault Tolerant Parallel Reduction Algorithm Using Swarm-Array Computing
    Varghese, Blesson
    McKee, Gerard
    Alexandrov, Vassil
    SIXTH INTERNATIONAL CONFERENCE ON AUTONOMIC AND AUTONOMOUS SYSTEMS: ICAS 2010, PROCEEDINGS, 2010, : 30 - 36
  • [5] Fault-tolerant PACS server
    Cao, F
    Liu, BJ
    Huang, HK
    Zhou, MZ
    Zhang, J
    Zhang, X
    Mogel, G
    MEDICAL IMAGING 2002: PACS AND INTEGRATED MEDICAL INFORMATION SYSTEMS: DESIGN AND EVALUATION, 2002, 4685 : 316 - 325
  • [6] A FAULT-TOLERANT SERVER ON MACH
    AREVALO, S
    CARRETERO, J
    CASTELLANOS, JL
    BARCO, F
    MICROPROCESSING AND MICROPROGRAMMING, 1993, 38 (1-5): : 793 - 800
  • [7] Fault tolerant implementation
    Eliaz, K
    REVIEW OF ECONOMIC STUDIES, 2002, 69 (03): : 589 - 610
  • [8] The Future of Fault Tolerant Computing
    Abraham, Jacob
    Iyer, Ravishankar
    Gizopoulos, Dimitris
    Alexandrescu, Dan
    Zorian, Yervant
    2015 IEEE 21ST INTERNATIONAL ON-LINE TESTING SYMPOSIUM (IOLTS), 2015, : 108 - 109
  • [9] FAULT-TOLERANT COMPUTING
    TOY, WN
    ADVANCES IN COMPUTERS, 1987, 26 : 201 - 279
  • [10] FAULT-TOLERANT COMPUTING
    PRADHAN, DK
    COMPUTER, 1980, 13 (03) : 6 - 7