SUPPLE: An efficient run-time support for non-uniform parallel loops

Authors
Orlando, S
Perego, R
Affiliations
[1] CNR, CNUCE, I-56126 Pisa, Italy
[2] Univ Ca Foscari Venezia, Dipartimento Matemat Applicata & Informat, I-30173 Venezia Mestre, Italy
Keywords
data parallelism; parallel loop scheduling; load balancing; run-time supports; compiler optimizations
DOI
10.1016/S1383-7621(98)00071-X
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper presents SUPPLE (SUPport for Parallel Loop Execution), an innovative run-time support for the execution of parallel loops with regular stencil data references and non-uniform iteration costs. SUPPLE relies upon a static block data distribution to exploit locality, and combines static and dynamic policies for scheduling non-uniform iterations. It adopts, as far as possible, a static scheduling policy derived from the owner-computes rule, and moves data and iterations among processors only if a load imbalance actually occurs. SUPPLE always tries to overlap communications with useful computations by reordering loop iterations and prefetching remote ones in the case of workload imbalance. The SUPPLE approach has been validated by many experimental results obtained by running a multidimensional flame simulation kernel on a 64-node Cray T3D. We have fed the benchmark code with several synthetic input data sets built on the basis of a load imbalance model. We have compared our results with those obtained with a CRAFT Fortran implementation of the benchmark. (C) 1999 Elsevier Science B.V. All rights reserved.
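The hybrid policy described in the abstract (static block ownership first, migration only when a processor runs out of local work) can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not SUPPLE's actual implementation: the function names `block_partition` and `simulate`, the `chunk` parameter, and the "take from the most loaded processor" rule are all hypothetical, communication costs are ignored, and the prefetching and overlap of communication with computation mentioned in the abstract are not modelled.

```python
from collections import deque

def block_partition(n_iters, n_procs):
    """Static block distribution: each processor owns one contiguous range."""
    base, extra = divmod(n_iters, n_procs)
    queues, start = [], 0
    for p in range(n_procs):
        size = base + (1 if p < extra else 0)
        queues.append(deque(range(start, start + size)))
        start += size
    return queues

def simulate(costs, n_procs, chunk=4, migrate=True):
    """Return (makespan, migrated_iterations) for a simulated loop execution.

    Hypothetical model: a processor runs its own block in chunks and, once
    idle, takes a chunk from the tail of the longest remaining queue.
    """
    queues = block_partition(len(costs), n_procs)
    clock = [0.0] * n_procs          # simulated time at which each processor is free
    active = set(range(n_procs))
    migrated = 0
    while active:
        # The next processor to become free makes the next scheduling decision.
        p = min(active, key=lambda q: (clock[q], q))
        if queues[p]:
            # Static phase: execute locally owned iterations.
            n = min(chunk, len(queues[p]))
            work = [queues[p].popleft() for _ in range(n)]
        else:
            # Dynamic phase: only entered once local work is exhausted.
            victim = max(range(n_procs), key=lambda q: len(queues[q]))
            if not migrate or not queues[victim]:
                active.discard(p)
                continue
            n = min(chunk, len(queues[victim]))
            work = [queues[victim].pop() for _ in range(n)]
            migrated += n
        clock[p] += sum(costs[i] for i in work)
    return max(clock), migrated

if __name__ == "__main__":
    # Synthetic imbalance: the last quarter of the iterations is 10x more costly.
    costs = [1.0] * 48 + [10.0] * 16
    print("static only:", simulate(costs, 4, migrate=False))
    print("with migration:", simulate(costs, 4, migrate=True))
```

With uniform costs the sketch never migrates anything, which mirrors the abstract's point that iterations move only if an imbalance actually occurs; with the skewed input above, idle processors absorb part of the overloaded block and the simulated completion time drops.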
Pages: 1323-1343
Number of pages: 21