Simultaneous scheduling of replication and computation for data-intensive applications on the grid

Cited by: 1
Authors
Desprez F. [1]
Vernois A. [1]
Affiliation
[1] LIP Laboratory/GRAAL Project, UMR CNRS, Univ. Claude Bernard Lyon 1, F-69364 Lyon Cedex 07
Keywords
Bioinformatics applications; Data management; Grid computing; Scheduling
DOI
10.1007/s10723-005-9016-2
Abstract
Managing large datasets has become a major application of Grids. Life science applications usually manage large databases that must be replicated for the applications to scale. The growing number of users and the ease of access to Internet-based applications have put Grid middleware under stress. Such environments are thus asked to manage data and schedule computation tasks at the same time, and these two operations have to be tightly coupled. This paper presents an algorithm (Scheduling and Replication Algorithm, SRA) that combines data management and scheduling using a steady-state approach. Using a model of the platform, the number of requests and their distribution, and the number and size of the databases, we define a linear program that satisfies all the constraints at every level of the platform in steady state. The solution of this linear program gives a placement of the databases on the servers and, for each kind of job, the server on which it should be executed. Our theoretical results are validated using simulation and logs from a large life science application. © Springer Science+Business Media, Inc. 2006.
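The abstract does not give the linear program itself. As a rough illustration of the kind of steady-state formulation it describes, the sketch below couples database placement with job assignment. All symbols are assumptions introduced here, not the paper's notation: servers $i$ with compute capacity $w_i$ and storage $m_i$, databases $j$ with size $s_j$, one job type per database with cost $c_j$ and request share $f_j$, placement variables $\delta_{ij}$, per-server job rates $n_{ij}$, and total throughput $\rho$. Network constraints, which the abstract's "every level of the platform" presumably covers, are left out.

```latex
\begin{align*}
\text{maximize}\quad   & \rho \\
\text{subject to}\quad
  & \sum_{j} s_j\,\delta_{ij} \le m_i
      && \forall i \quad \text{(storage on server $i$)} \\
  & \sum_{j} c_j\,n_{ij} \le w_i
      && \forall i \quad \text{(compute on server $i$)} \\
  & n_{ij} \le \frac{w_i}{c_j}\,\delta_{ij}
      && \forall i,j \quad \text{(jobs run only where their database is stored)} \\
  & \sum_{i} n_{ij} = f_j\,\rho
      && \forall j \quad \text{(request distribution is respected)} \\
  & \delta_{ij} \in \{0,1\},\quad n_{ij} \ge 0,\quad \rho \ge 0
\end{align*}
```

With binary $\delta_{ij}$ this is a mixed-integer program; relaxing to $0 \le \delta_{ij} \le 1$ gives a linear program whose fractional placement would then need to be rounded. In this sketch the $\delta_{ij}$ correspond to the database placement and the $n_{ij}$ to the per-job-type server assignment that the abstract mentions.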
Pages: 19-31
Page count: 12