Generating efficient tiled code for distributed memory machines

Cited by: 9
Authors
Tang, PY
Xue, JL [1]
Affiliations
[1] Univ New England, Sch Comp Sci & Engn, Sydney, NSW 2351, Australia
[2] Univ So Queensland, Dept Math & Comp, Toowoomba, Qld 4350, Australia
Funding
Australian Research Council
Keywords
nested loops; tiling; distributed memory machines; SPMD; memory optimisations
DOI
10.1016/S0167-8191(00)00040-5
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Tiling can improve the performance of nested loops on distributed memory machines by exploiting coarse-grain parallelism and reducing communication overhead and frequency. Tiling calls for a compilation approach that performs first computation distribution and then data distribution, both possibly on a skewed iteration space. This paper presents a suite of compiler techniques for generating efficient SPMD programs to execute rectangularly tiled iteration spaces on distributed memory machines. The following issues are addressed: computation and data distribution, message-passing code generation, memory management and optimisations, and global to local address translation. Methods are developed for partitioning arbitrary iteration spaces and skewed data spaces. Techniques for generating efficient message-passing code for both arbitrary and rectangular iteration spaces are presented. A storage scheme for managing both local and nonlocal references is developed, which leads to SPMD code with high locality of references. Two memory optimisations are given to reduce the amount of memory usage for skewed iteration spaces and expanded arrays, respectively. The proposed compiler techniques are illustrated using a simple running example and finally analysed and evaluated based on experimental results on a Fujitsu AP1000 consisting of 128 processors. (C) 2000 Published by Elsevier Science B.V. All rights reserved.
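Illustrative sketch (not taken from the paper): the abstract names rectangular tiling, computation distribution, and global to local address translation. The Python below shows one simple way these pieces can fit together for a 2-D iteration space. The function names, the tile sizes, and the policy of dealing tile rows to processors cyclically are all assumptions made for illustration; the paper's actual schemes may differ.

```python
def tiles(n_i, n_j, t_i, t_j):
    """Yield (tile_coord, points) for a rectangular tiling of an
    n_i x n_j iteration space with tiles of size t_i x t_j."""
    for ti in range(0, n_i, t_i):
        for tj in range(0, n_j, t_j):
            points = [
                (i, j)
                for i in range(ti, min(ti + t_i, n_i))
                for j in range(tj, min(tj + t_j, n_j))
            ]
            yield (ti // t_i, tj // t_j), points


def owner(tile_coord, num_procs):
    """Assumed policy: tile rows are dealt to processors cyclically."""
    return tile_coord[0] % num_procs


def global_to_local(i, t_i, num_procs):
    """Translate a global row index into (processor, local row index)
    under the assumed cyclic-by-tile-row distribution."""
    tile_row = i // t_i                 # which tile row holds global row i
    proc = tile_row % num_procs         # processor owning that tile row
    local_tile = tile_row // num_procs  # position of the tile row on that processor
    return proc, local_tile * t_i + i % t_i


if __name__ == "__main__":
    for coord, pts in tiles(4, 4, 2, 2):
        print("tile", coord, "on processor", owner(coord, 2), "covers", pts)
    print(global_to_local(5, t_i=2, num_procs=2))
```

Under this assumed policy each processor stores its tile rows contiguously, so a loop over local rows touches only local memory, and only references that cross a tile boundary would require messages.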
Pages: 1369-1410
Page count: 42
Related papers
50 in total
  • [1] Generating Efficient Data Movement Code for Heterogeneous Architectures with Distributed-Memory
    Dathathri, Roshan
    Reddy, Chandan
    Ramashekar, Thejas
    Bondhugula, Uday
    [J]. 2013 22ND INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT), 2013, : 375 - 386
  • [2] Eliminating redundant communication of code generation for distributed memory machines
    Shen, Ya Nan
    Zhao, Rong Cai
    Wang, Lei
    [J]. SNPD 2007: EIGHTH ACIS INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ARTIFICIAL INTELLIGENCE, NETWORKING, AND PARALLEL/DISTRIBUTED COMPUTING, VOL 2, PROCEEDINGS, 2007, : 751 - +
  • [3] COMMUNICATION OPTIMIZATION AND CODE GENERATION FOR DISTRIBUTED-MEMORY MACHINES
    AMARASINGHE, SP
    LAM, MS
    [J]. SIGPLAN NOTICES, 1993, 28 (06): : 126 - 138
  • [4] Maple programs for generating efficient FORTRAN code for serial and vectorised machines
    Gomez, C
    Scott, T
    [J]. COMPUTER PHYSICS COMMUNICATIONS, 1998, 115 (2-3) : 548 - 562
  • [5] Generating Multibillion Element Unstructured Meshes on Distributed Memory Parallel Machines
    Soner, Seren
    Ozturan, Can
    [J]. SCIENTIFIC PROGRAMMING, 2015, 2015
  • [6] PaKman: A Scalable Algorithm for Generating Genomic Contigs on Distributed Memory Machines
    Ghosh, Priyanka
    Krishnamoorthy, Sriram
    Kalyanaraman, Ananth
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (05) : 1191 - 1209
  • [7] EFFICIENT SUPPORT FOR IRREGULAR APPLICATIONS ON DISTRIBUTED-MEMORY MACHINES
    MUKHERJEE, SS
    SHARMA, SD
    HILL, MD
    LARUS, JR
    ROGERS, A
    SALTZ, J
    [J]. SIGPLAN NOTICES, 1995, 30 (08): : 68 - 79
  • [8] Efficient techniques for performing an irregular computation on distributed memory machines
    Shah, HV
    Fortes, JAB
    [J]. INTERNATIONAL JOURNAL OF SYSTEMS SCIENCE, 1997, 28 (11) : 1101 - 1113
  • [9] An efficient code generation technique for tiled iteration spaces
    Goumas, G
    Athanasaki, M
    Koziris, N
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2003, 14 (10) : 1021 - 1034
  • [10] Compiling array expressions for efficient execution on distributed-memory machines
    Gupta, SKS
    Kaushik, SD
    Huang, CH
    Sadayappan, P
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 1996, 32 (02) : 155 - 172