Data intensive and network aware (DIANA) grid scheduling

Cited by: 49
Authors
McClatchey R. [1]
Anjum A. [1, 3]
Stockinger H. [2]
Ali A. [3]
Willers I. [4]
Thomas M. [5]
Affiliations
[1] CCS Research Centre, University of the West of England, Bristol
[2] Swiss Institute of Bioinformatics, Lausanne
[3] National University of Sciences and Technology, Rawalpindi
[4] CERN, European Organization for Nuclear Research, Geneva
[5] California Institute of Technology, Pasadena, CA
Source
J. Grid Comput. | 2007 / Vol. 1 / pp. 43-64
Keywords
Data intensive; Meta scheduling; Network awareness; Peer-to-peer architectures; Scheduling algorithm
DOI
10.1007/s10723-006-9059-z
Abstract
In Grids, scheduling decisions are often made on the basis of jobs being either data or computation intensive: in data-intensive situations, jobs may be pushed to the data, and in computation-intensive situations, data may be pulled to the jobs. This kind of scheduling, in which there is no consideration of network characteristics, can lead to performance degradation in a Grid environment and may result in large processing queues and job execution delays due to site overloads. In this paper we describe a Data Intensive and Network Aware (DIANA) meta-scheduling approach, which takes into account data, processing power and network characteristics when making scheduling decisions across multiple sites. Through a practical implementation on a Grid testbed, we demonstrate that queue and execution times of data-intensive jobs can be significantly improved when we introduce our proposed DIANA scheduler. The basic scheduling decisions are dictated by a weighting factor for each potential target location, which is a calculated function of network characteristics, processing cycles and data location and size. The job scheduler provides a global ranking of the computing resources and then selects an optimal one on the basis of this overall access and execution cost. The DIANA approach considers the Grid as a combination of active network elements and takes network characteristics as a first-class criterion in the scheduling decision matrix along with computations and data. The scheduler can then make informed decisions by taking into account the changing state of the network, locality and size of the data and the pool of available processing cycles. © Springer Science + Business Media B.V. 2007.
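The ranking idea in the abstract can be illustrated with a minimal Python sketch. This is not the DIANA cost function from the paper: the `Site` fields, the weights `w_net` and `w_comp`, and the simple transfer-plus-queue cost model are all assumptions chosen only to show how a per-site cost combining network, computation and data location might drive a global ranking.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of a candidate execution site."""
    name: str
    bandwidth_mbps: float  # assumed available bandwidth towards this site
    queue_length: int      # jobs already waiting at the site
    cpu_power: float       # relative processing capacity (higher is faster)
    has_data: bool         # True if the input data is already stored here


def site_cost(site, data_size_mb, job_work, w_net=1.0, w_comp=1.0):
    """Illustrative cost: weighted sum of transfer time and compute delay.

    Assumption: every queued job needs the same work as the new job,
    and no transfer is needed when the data is already local.
    """
    transfer = 0.0 if site.has_data else data_size_mb * 8 / site.bandwidth_mbps
    compute = (site.queue_length + 1) * job_work / site.cpu_power
    return w_net * transfer + w_comp * compute


def rank_sites(sites, data_size_mb, job_work):
    """Return the sites ordered from lowest to highest estimated cost."""
    return sorted(sites, key=lambda s: site_cost(s, data_size_mb, job_work))


sites = [
    Site("A", bandwidth_mbps=100, queue_length=10, cpu_power=1.0, has_data=True),
    Site("B", bandwidth_mbps=1000, queue_length=0, cpu_power=1.0, has_data=False),
    Site("C", bandwidth_mbps=10, queue_length=0, cpu_power=1.0, has_data=False),
]
ranked = rank_sites(sites, data_size_mb=1000, job_work=60)
print([s.name for s in ranked])  # ['B', 'A', 'C']
```

In this toy setup, site A holds the data but has a long queue, so a purely data-driven policy would send the job there; once the fast link to the idle site B is taken into account, B comes out ahead, while the slow link to C makes it the most expensive choice.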
Pages: 43-64
Page count: 21
Related papers
50 in total
  • [31] An Efficiency-Aware Scheduling for Data-Intensive Computations on MapReduce Clusters
    Zhao, Hui
    Yang, Shuqiang
    Fan, Hua
    Chen, Zhikun
    Xu, Jinghu
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2013, E96D (12): : 2654 - 2662
  • [32] Cloud-aware data intensive workflow scheduling on volunteer computing systems
    Ghafarian, Toktam
    Javadi, Bahman
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2015, 51 : 87 - 97
  • [33] Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers
    Cheng, Long
    Wang, Ying
    Liu, Qingzhi
    Epema, Dick H. J.
    Liu, Cheng
    Mao, Ying
    Murphy, John
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (06) : 1494 - 1510
  • [34] Data Volume-aware Computation Task Scheduling for Smart Grid Data Analytic Applications
    Guo, Binquan
    Li, Hongyan
    Yan, Ye
    Zhang, Zhou
    Wang, Peng
    ICC 2023-IEEE INTERNATIONAL CONFERENCE ON COMMUNICATIONS, 2023, : 4113 - 4118
  • [35] Adaptive divisible load model for scheduling data-intensive grid applications
    Othman, M.
    Abdullah, M.
    Ibrahim, H.
    Subramaniam, S.
    COMPUTATIONAL SCIENCE - ICCS 2007, PT 1, PROCEEDINGS, 2007, 4487 : 446 - +
  • [36] A Distributed Cross-Entropy Ant Algorithm for Network-Aware Grid Scheduling
    Yi, Hu
    Bin, Gong
    JCPC: 2009 JOINT CONFERENCE ON PERVASIVE COMPUTING, 2009, : 253 - 256
  • [37] A grid resource broker with network bandwidth-aware job scheduling for computational grids
    Yang, Chao-Tung
    Chen, Sung-Yi
    Chen, Tsui-Ting
    ADVANCES IN GRID AND PERVASIVE COMPUTING, PROCEEDINGS, 2007, 4459 : 1 - +
  • [38] Network Latency and Application Performance Aware Cluster Scheduling in Data Centers
    Popescu, Diana Andreea
    Moore, Andrew W.
    IEEE NETWORK, 2022, 36 (02): : 58 - 65
  • [39] An Energy-Aware Heuristic Scheduling for Data-Intensive Workflows in Virtualized Datacenters
    肖鹏
    胡志刚
    张艳平
    Journal of Computer Science & Technology, 2013, 28 (06) : 948 - 961
  • [40] An Energy-Aware Heuristic Scheduling for Data-Intensive Workflows in Virtualized Datacenters
    Peng Xiao
    Zhi-Gang Hu
    Yan-Ping Zhang
    Journal of Computer Science and Technology, 2013, 28 : 948 - 961