Towards optimal fault tolerant scheduling in computational grid

Citations: 0
Authors
Imran, Muhammad [1]
Niaz, Iftikhar Azim [1]
Haider, Sajjad [2]
Hussain, Naveed [2]
Ansari, M. A. [3]
Affiliations
[1] Riphah Int Univ, Fac Comp, Islamabad, Pakistan
[2] Natl Univ Modern Languages, Informat Technol Dept, Islamabad, Pakistan
[3] Federal Urdu Univ Arts Sci & Technol, Comp Sci Dept, Islamabad, Pakistan
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
Grid environments pose significant challenges because of the diverse failures encountered during job execution. Computational grids provide the main execution platform for long-running jobs, which require a long commitment of grid resources. Fault tolerance in such an environment therefore cannot be ignored. Most grid middleware has either ignored failure issues or developed ad hoc solutions. Most existing fault tolerance techniques are application dependent and cause a cognitive problem. This paper examines existing fault detection and tolerance techniques in various middleware. We propose a fault-tolerant layered grid architecture with a cross-layered design. In our approach, the Hybrid Particle Swarm Optimization (HPSO) algorithm and the Anycast technique are used in conjunction with the Globus middleware. We adopt a proactive and reactive fault management strategy for centralized and distributed environments. The proposed strategy helps identify the root cause of failures and resolve the cognitive problem. Our strategy minimizes computation and communication, thus achieving higher reliability. Anycast limits the effect of Denial of Service/Distributed Denial of Service (DoS/DDoS) attacks to the area nearest the source of the attack, thus achieving better security. Significant performance improvement is achieved by applying Anycast before HPSO. Selecting more reliable nodes results in less checkpointing overhead.
Pages: 154 / +
Page count: 2
Related papers
50 in total
  • [31] Fault Tolerant Decentralized Scheduling Algorithm for P2P Grid
    Chauhan, Piyush
    Nitin
    [J]. 2ND INTERNATIONAL CONFERENCE ON COMMUNICATION, COMPUTING & SECURITY [ICCCS-2012], 2012, 1 : 698 - 707
  • [32] An algorithm for online distributed fault-tolerant job scheduling in grid computing
    Zeng, Jun
    [J]. INTERNATIONAL JOURNAL OF WEB AND GRID SERVICES, 2021, 17 (04) : 389 - 407
  • [33] Fuzzy Logic-Based Secure and Fault Tolerant Job Scheduling in Grid
    王乘
    蒋从锋
    刘小虎
    [J]. Tsinghua Science and Technology, 2007, (S1) : 45 - 50
  • [34] GA-based Job Scheduling Strategies for Fault Tolerant Grid Systems
    Wu, Chao-Chin
    Lai, Kuan-Chou
    Sun, Ren-Yi
    [J]. 2008 IEEE ASIA-PACIFIC SERVICES COMPUTING CONFERENCE, VOLS 1-3, PROCEEDINGS, 2008, : 27 - +
  • [35] Replication based fault tolerant job scheduling strategy for economy driven grid
    Babar Nazir
    Kalim Qureshi
    Paul Manuel
    [J]. The Journal of Supercomputing, 2012, 62 : 855 - 873
  • [36] Replication based fault tolerant job scheduling strategy for economy driven grid
    Nazir, Babar
    Qureshi, Kalim
    Manuel, Paul
    [J]. JOURNAL OF SUPERCOMPUTING, 2012, 62 (02): : 855 - 873
  • [37] Fault-tolerant scheduling of fine-grained tasks in grid environments
    Wrzesinska, G
    van Nieuwpoort, RV
    Maassen, J
    Kielmann, T
    Bal, HE
    [J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2006, 20 (01): : 103 - 114
  • [38] FTM2: Fault Tolerant Batch Mode Heuristics in Computational Grid
    Panda, Sanjaya Kumar
    Khilar, Pabitra Mohan
    Mohapatra, Durga Prasad
    [J]. DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, ICDCIT 2014, 2014, 8337 : 98 - 104
  • [39] A fault-tolerant hybrid resource allocation model for dynamic computational grid
    Sheikh, Sophiya
    Nagaraju, A.
    Shahid, Mohammad
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2021, 48
  • [40] Exploiting tuple spaces to provide fault-tolerant scheduling on computational grids
    Favarim, Fabio
    Fraga, Joni da Silva
    Lung, Lau Cheuk
    Correia, Miguel
    Santos, Joao Felipe
    [J]. 10TH IEEE INTERNATIONAL SYMPOSIUM ON OBJECT AND COMPONENT-ORIENTED REAL-TIME DISTRIBUTED COMPUTING, PROCEEDINGS, 2007, : 403 - +