Fault Tolerant Task Scheduling on Computational Grid Using Checkpointing Under Transient Faults

Authors
Ritu Garg
Awadhesh Kumar Singh
Affiliation
[1] National Institute of Technology, Computer Engineering Department
Keywords
Grid computing; Task scheduling; Fault tolerance; Checkpointing; Weibull failure distribution; Genetic algorithm;
DOI
Not available
Abstract
Application scheduling is crucial in grid computing environments, and the failure of grid resources poses a great challenge to it. Most existing application scheduling algorithms handle resource failures through reliability-aware scheduling that ignores performance, and they do not provide adequate fault tolerance. In this paper, we propose a fault tolerant task scheduling algorithm for independent and dependent (workflow) tasks that considers both the reliability and the performance of grid resources. We focus on Weibull distributed failures of grid resources rather than the commonly adopted assumption of a Poisson failure distribution. To handle such failures, rollback recovery via checkpoint/restart is used to improve system dependability and reliability. The checkpointing frequency is chosen to minimize the fault tolerance overhead (expected wasted time). From the minimal wasted time, a new factor called the capacity decreasing factor is derived; it accounts for both the performance and the failure characteristics of the resources. The scheduling decision is then made by a genetic algorithm that uses the capacity decreasing factor to compute the new computing capacity of each resource in the presence of failures. The resulting schedule offers both optimal performance (makespan) and reliability (i.e., the lowest tendency to fail). Precedence constraints among sub-tasks are also considered: tasks are ordered according to their precedence relationships and fault tolerance overhead. Simulation results show that the proposed fault tolerant scheduling algorithm achieves better performance and execution reliability than previous algorithms in the presence of failures.
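The idea in the abstract of picking a checkpoint interval that minimizes expected wasted time under Weibull failures, and then derating a resource's capacity accordingly, can be illustrated with a small Monte Carlo sketch. This is not the paper's algorithm: the failure model (a fresh Weibull time-to-failure after every restart), the parameter names, the candidate intervals, and the `effective_capacity` formula are all assumptions made for illustration, and the formula is only a hypothetical stand-in for the capacity decreasing factor, whose exact definition the abstract does not give.

```python
import random


def simulate_wasted_time(work, interval, ckpt_cost, restart_cost,
                         scale, shape, rng):
    """Simulate one run with checkpoint/restart; return wall time minus work.

    Assumption: after each failure the resource restarts and its next
    time-to-failure is a fresh Weibull(scale, shape) draw.
    """
    remaining = work      # useful work not yet safely completed
    elapsed = 0.0         # total wall-clock time
    while remaining > 0:
        ttf = rng.weibullvariate(scale, shape)  # time until next failure
        uptime = 0.0
        failed = False
        while remaining > 0:
            segment = min(interval, remaining)
            is_last = segment == remaining
            # The final segment is not followed by a checkpoint.
            cost = segment if is_last else segment + ckpt_cost
            if uptime + cost > ttf:
                failed = True   # work since the last checkpoint is lost
                break
            uptime += cost
            remaining -= segment
        elapsed += (ttf + restart_cost) if failed else uptime
    return elapsed - work


def mean_wasted_time(work, interval, ckpt_cost, restart_cost,
                     scale, shape, runs=200, seed=0):
    """Monte Carlo estimate of the expected wasted time for one interval."""
    rng = random.Random(seed)
    total = sum(
        simulate_wasted_time(work, interval, ckpt_cost, restart_cost,
                             scale, shape, rng)
        for _ in range(runs)
    )
    return total / runs


def best_interval(candidates, **params):
    """Pick the candidate interval with the lowest estimated wasted time."""
    return min(candidates,
               key=lambda iv: mean_wasted_time(interval=iv, **params))


def effective_capacity(nominal, work, waste):
    """Hypothetical derating: scale capacity by useful time over total time."""
    return nominal * work / (work + waste)


if __name__ == "__main__":
    # Illustrative values only; none of them come from the paper.
    params = dict(work=100.0, ckpt_cost=1.0, restart_cost=2.0,
                  scale=60.0, shape=0.7)
    iv = best_interval([5.0, 10.0, 20.0, 50.0], **params)
    waste = mean_wasted_time(interval=iv, **params)
    print(iv, waste, effective_capacity(1000.0, params["work"], waste))
```

A scheduler in the spirit of the abstract could then rank resources by this derated capacity instead of their nominal capacity, so that a fast but failure-prone resource is not automatically preferred over a slower, more reliable one.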
Pages: 8775-8791
Number of pages: 16
Related papers
50 in total
  • [1] Fault Tolerant Task Scheduling on Computational Grid Using Checkpointing Under Transient Faults
    Garg, Ritu
    Singh, Awadhesh Kumar
    [J]. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING, 2014, 39 (12) : 8775 - 8791
  • [2] Fault tolerant job scheduling in computational grid
    Nazir, Babar
    Khan, Taimoor
    [J]. SECOND INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES 2006, PROCEEDINGS, 2006, : 708 - +
  • [3] Towards optimal fault tolerant scheduling in computational grid
    Imran, Muhammad
    Niaz, Iftikhar Azim
    Haider, Sajjad
    Hussain, Naveed
    Ansari, M. A.
    [J]. THIRD INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES 2007, PROCEEDINGS, 2007, : 154 - +
  • [4] A dependable task scheduling strategy for a fault tolerant grid model
    Wang, YZ
    Lin, C
    Zhai, ZL
    Yang, Y
    [J]. ADVANCED WEB AND NETWORK TECHNOLOGIES, AND APPLICATIONS, PROCEEDINGS, 2006, 3842 : 534 - 539
  • [5] Fault-tolerant scheduling of independent tasks in computational grid
    Zheng, Qin
    Veeravalli, Bharadwaj
    Tham, Chen-Khong
    [J]. 2006 10TH IEEE SINGAPORE INTERNATIONAL CONFERENCE ON COMMUNICATION SYSTEMS, VOLS 1 AND 2, 2006, : 102 - +
  • [6] Component based proactive fault tolerant scheduling in computational grid
    Haider, Sajjad
    Imran, Muhammad
    Niaz, Iftikhar Azim
    Ullah, Saeed
    Ansari, M. A.
    [J]. THIRD INTERNATIONAL CONFERENCE ON EMERGING TECHNOLOGIES 2007, PROCEEDINGS, 2007, : 119 - +
  • [7] Dynamic and Adaptive Fault Tolerant Scheduling With QoS Consideration in Computational Grid
    Haider, Sajjad
    Nazir, Babar
    [J]. IEEE ACCESS, 2017, 5 : 7853 - 7873
  • [8] High performance fault tolerant resource scheduling in computational grid environment
    Goswami S.
    Mukherjee K.
    [J]. International Journal of Web-Based Learning and Teaching Technologies, 2020, 15 (01) : 73 - 87
  • [9] A Novel Fault-tolerant Task Scheduling Algorithm for Computational Grids
    Naik, Jairam K.
    Satyanarayana, N.
    [J]. 2013 15TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING TECHNOLOGIES (ICACT), 2013,
  • [10] Automatic Checkpointing based Fault Tolerance in Computational Grid
    Babu, Ch. Ramesh
    Rao, Ch. D. V. Subba
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMPUTING, MANAGEMENT AND TELECOMMUNICATIONS (COMMANTEL), 2014, : 41 - 45