Distributed fault management for computational grids

Cited by: 0
Authors
Affaan, Muhammad [1]
Ansari, M. A. [2]
Affiliations
[1] Muhammad Ali Jinnah Univ, Islamabad, Pakistan
[2] Fed Urdu Univ Arts, Sci & Tech, Islamabad, Pakistan
Keywords
grid environment; fault management; checkpointing; single point of failure
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Grid resources have heterogeneous architectures, are geographically distributed, and are interconnected via unreliable network media, so they are at risk of failure. Because a grid environment consists of unreliable resources, fault-tolerance mechanisms cannot be ignored. Some scientific jobs require long commitments of grid resources, and failures of those resources cannot be overlooked. These failures need flexible management that also accounts for failure of the fault manager itself. In this paper we propose distributed management of failures that does not dedicate resources exclusively to this task: resources performing fault management may also serve long-running user jobs. Each sub-job of the main user job is inspected by an individual resource. In case of failure, the inspector resource takes over in place of the inspected resource. The contributions of this paper are the elimination of the single point of failure and the ability of the proposed concept to be integrated with a variety of grid middleware.
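The inspector/inspected relationship described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the ring-shaped assignment of inspectors, the `Resource` class, and the function names `assign_ring` and `inspect_and_recover` are all hypothetical, and checkpoint restore is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """A grid resource that both runs sub-jobs and inspects a peer (hypothetical model)."""
    name: str
    alive: bool = True
    jobs: list = field(default_factory=list)  # sub-jobs this resource is executing


def assign_ring(resources, sub_jobs):
    """Give each resource one sub-job; its ring neighbour acts as its inspector.

    The ring layout is an assumption; it only ensures that no single
    resource is responsible for all fault management.
    """
    inspector_of = {}
    for i, (res, job) in enumerate(zip(resources, sub_jobs)):
        res.jobs.append(job)
        inspector_of[res.name] = resources[(i + 1) % len(resources)]
    return inspector_of


def inspect_and_recover(resources, inspector_of):
    """One inspection round: a live inspector takes over the sub-jobs of a failed peer."""
    takeovers = []
    for res in resources:
        inspector = inspector_of[res.name]
        if not res.alive and res.jobs and inspector.alive:
            inspector.jobs.extend(res.jobs)
            takeovers.append((res.name, inspector.name, list(res.jobs)))
            res.jobs.clear()
    return takeovers


def demo():
    resources = [Resource("r0"), Resource("r1"), Resource("r2")]
    inspector_of = assign_ring(resources, ["job-a", "job-b", "job-c"])
    resources[1].alive = False  # simulate failure of r1
    takeovers = inspect_and_recover(resources, inspector_of)
    return resources, takeovers
```

In `demo()`, `r1` fails and its inspector `r2` picks up `job-b` while continuing its own `job-c`, so the inspecting resource is not reserved for fault management alone. The sketch does not reassign inspectors after a failure, which a real system would need in order to keep every sub-job covered.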
Pages: 363 / +
Page count: 2