Fault tolerance using "Parallel shadow image servers (PSIS)" in grid based computing environment

Cited by: 3

Authors:

Hussain, Naveed [1]

Ansari, M. A.

Yasin, M. M.

Rauf, Abdul

Haider, Sajjad

Affiliations:

[1] Natl Univ Modern Languages, Dept Informat Technol, Islamabad, Pakistan

[2] Fed Urdu Univ Arts Sci & Technol, Dept Comp Sci, Islamabad, Pakistan

[3] COMSATS Inst Informat Technol, Dept Comp Sci, Islamabad, Pakistan

Source:

SECOND INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES 2006, PROCEEDINGS | 2006

Keywords:

grid computing; fault tolerance; PSIS; condor; cactus; job scheduling;

DOI:

10.1109/ICET.2006.335982

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a critical review of the existing fault tolerance mechanisms in grid computing and of the overhead they incur in reprocessing or rescheduling jobs when a fault arises. To address this, we propose the Parallel Shadow Image Server (PSIS) copying technique, which runs in parallel with the Resource Manager and keeps checkpoints so that jobs can be rescheduled from the nearest flag when a fault is detected. A job is scheduled from the resource manager node to the worker nodes, and after a pre-specified amount of time the worker nodes submit it back in serialized form to the Parallel Shadow Image Servers; we call this the recent spawn, or the flag checkpoint, for rescheduling or reprocessing the job. If a fault arises, the job is rescheduled from the most recent checkpoint and submitted to the worker node on which it was terminated. This not only saves time but also improves performance to a major extent.
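The checkpoint-and-resume flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `ShadowImageServer`, `run_job`, and `schedule` are hypothetical, the "job" is a toy summation, the fault is simulated with an exception, and checkpoints are taken every fixed number of steps rather than after the pre-specified amount of time the abstract mentions.

```python
import pickle


class ShadowImageServer:
    """Hypothetical stand-in for one PSIS node: keeps the latest serialized image per job."""

    def __init__(self):
        self._images = {}

    def store(self, job_id, state):
        # Serialize so the stored image is a snapshot, not a live reference.
        self._images[job_id] = pickle.dumps(state)

    def latest(self, job_id):
        blob = self._images.get(job_id)
        return None if blob is None else pickle.loads(blob)


def run_job(job_id, total_steps, servers, checkpoint_every, fail_at=None):
    """Worker side: resume from the most recent checkpoint, if any, and sum 0..total_steps-1."""
    state = None
    for server in servers:
        state = server.latest(job_id)
        if state is not None:
            break
    if state is None:
        state = {"next_step": 0, "acc": 0}

    step = state["next_step"]
    while step < total_steps:
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated worker fault")
        state["acc"] += step
        step += 1
        state["next_step"] = step
        # Assumption: checkpoint by step count instead of elapsed time.
        if step % checkpoint_every == 0:
            for server in servers:
                server.store(job_id, state)
    return state["acc"]


def schedule(job_id, total_steps, servers, checkpoint_every, fail_at=None):
    """Resource-manager side: on a fault, reschedule the job from its last checkpoint."""
    try:
        return run_job(job_id, total_steps, servers, checkpoint_every, fail_at)
    except RuntimeError:
        return run_job(job_id, total_steps, servers, checkpoint_every)
```

In this sketch, a job that fails partway through is restarted from the last stored image instead of from step zero, so only the work done since that checkpoint is repeated.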


Pages: 703-707

Number of pages: 5
