Multi-Replication with Intelligent Staging in Data-Intensive Grid Applications

Authors
Machida, Yuya [1]
Takizawa, Shin'ichiro [1]
Nakada, Hidemoto [1, 2]
Matsuoka, Satoshi [1, 3]
Affiliations
[1] Tokyo Inst Technol, 2-12-1 Ookayama, Tokyo 1528550, Japan
[2] Natl Inst Adv Ind Sci & Technol, Tsukuba, Ibaraki 3058568, Japan
[3] Natl Inst Informat, Tokyo 1018430, Japan
DOI
10.1109/ICGRID.2006.311002
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Existing data grid scheduling systems handle huge data I/O via replica location services coupled with simple staging, decoupled from the scheduling of computing tasks. However, when the application or workflow scales, we observe considerable performance degradation compared to processing within a tightly coupled cluster. For example, when numerous nodes access the same set of files simultaneously, major performance degradation occurs even if replicas are used, due to bottlenecks that manifest in the infrastructure. Instead of resorting to expensive solutions such as parallel file systems, we propose alleviating the situation by tightly coupling replica and data transfer management with computation scheduling. In particular, we propose three techniques: (1) dynamic aggregation and O(1) replication of data-staging requests across multiple nodes using a multi-replication framework, (2) replica-centric scheduling, which uses data re-use and time-to-replication as compute scheduling metrics on the grid, and (3) overlapped execution of data staging and compute-bound tasks. Early benchmark results from our prototype Condor-like grid scheduling system demonstrate that the techniques are quite effective in eliminating much of the data transfer overhead in many cases.
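As an illustration of technique (2) only, the following is a minimal Python sketch of how data re-use and time-to-replication might serve as scheduling metrics. It is not the authors' implementation: the `Node` class, the `time_to_replication` and `pick_node` functions, the bandwidth figures, and the simple size-over-bandwidth cost model are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical compute node (names and fields are illustrative)."""
    name: str
    bandwidth_mb_s: float  # assumed staging bandwidth to this node
    replicas: set = field(default_factory=set)  # files already staged here


def time_to_replication(node, files, sizes_mb):
    """Estimated seconds to stage the files this node does not yet hold.

    Files already present count as re-used data and cost nothing.
    """
    missing = [f for f in files if f not in node.replicas]
    return sum(sizes_mb[f] for f in missing) / node.bandwidth_mb_s


def pick_node(nodes, files, sizes_mb):
    """Pick the node with the smallest estimated staging time."""
    return min(nodes, key=lambda n: time_to_replication(n, files, sizes_mb))


if __name__ == "__main__":
    sizes = {"db.dat": 5000.0, "query.fa": 10.0}  # sizes in MB (made up)
    nodes = [
        Node("fast-empty", bandwidth_mb_s=1000.0),
        Node("slow-cached", bandwidth_mb_s=100.0, replicas={"db.dat"}),
    ]
    chosen = pick_node(nodes, ["db.dat", "query.fa"], sizes)
    print(chosen.name)
```

In this toy setup the node that already holds the large file is preferred even though its link is slower, because only the small file still needs to be staged. A real scheduler would also weigh compute load and concurrent transfers, which this sketch ignores.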
Pages: 88 / +
Page count: 3