Implementation of Watch Dog Timer for Fault Tolerant Computing on Cluster Server

Authors
Bheevgade, Meenakshi [1]
Patrikar, Rajendra M. [1]
Affiliations
[1] Visvesvaraya Natl Inst Technol, Nagpur 440010, Maharashtra, India
Keywords
Cluster; Fault tolerant; Grid; Grid Computing System; Meta-computing
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
In today's technology era, clusters have become a necessity for modern computing and data applications, since many applications need long computation times (days or even months). Although parallelization speeds up computation, the time required by many applications can still be long. The reliability of the cluster therefore becomes a very important issue, and implementing a fault-tolerant mechanism becomes essential. The difficulty of designing a fault-tolerant cluster system increases with the difficulty of the various failures it must handle. The most important requirement is that an algorithm that handles a simple failure in a system must also tolerate more severe failures. In this paper, we implement the watchdog timer concept in a parallel environment to handle failures. Implementing a simple algorithm in our project helps us handle different types of failures; consequently, we found that the reliability of the cluster improves.
Pages: 265-268
Number of pages: 4