Diamond Tiling: Tiling Techniques to Maximize Parallelism for Stencil Computations

Cited by: 42
Authors
Bondhugula, Uday [1]
Bandishti, Vinayaka [1]
Pananilath, Irshad [1]
Affiliation
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Keywords
Compilers; program transformation; loop tiling; parallelism; locality; stencils
DOI
10.1109/TPDS.2016.2615094
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of the iteration space and a set of tiling directions such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. However, existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique, called diamond tiling, that ensures concurrent start-up as well as perfect load-balance whenever possible. We first provide necessary and sufficient conditions for a set of tiling hyperplanes to allow concurrent start for programs with affine data accesses. We then provide an approach to automatically find such hyperplanes. Experimental evaluation on a 12-core Intel Westmere shows that diamond tiled code is able to outperform a tuned domain-specific stencil code generator by 10 to 40 percent, and previous compiler techniques by a factor of 1.3x to 10.1x.
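The concurrent-start idea in the abstract can be illustrated with a small sketch. The abstract does not spell out the conditions, so the check below rests on two assumptions: a hyperplane h is treated as a valid tiling direction when h·d >= 0 for every dependence vector d, and concurrent start along a face with normal f is taken to require that f be a strictly positive combination of the tiling hyperplanes. The 1-D three-point stencil, the function names, and the example hyperplanes are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Assumed example: dependence vectors (dt, di) of a 1-D, 3-point stencil,
# where A[t+1][i] reads A[t][i-1], A[t][i] and A[t][i+1].
DEPS = [(1, -1), (1, 0), (1, 1)]


def is_valid_tiling(hyperplanes, deps=DEPS):
    """Assumed validity test: every hyperplane h satisfies h.d >= 0 for all d."""
    return all(h[0] * d[0] + h[1] * d[1] >= 0 for h in hyperplanes for d in deps)


def conic_coefficients(face, h1, h2):
    """Solve k1*h1 + k2*h2 = face exactly (Cramer's rule); None if h1 || h2."""
    det = h1[0] * h2[1] - h1[1] * h2[0]
    if det == 0:
        return None
    k1 = Fraction(face[0] * h2[1] - face[1] * h2[0], det)
    k2 = Fraction(h1[0] * face[1] - h1[1] * face[0], det)
    return k1, k2


def allows_concurrent_start(face, h1, h2, deps=DEPS):
    """Assumed condition: valid hyperplanes with face strictly inside their cone."""
    if not is_valid_tiling([h1, h2], deps):
        return False
    coeffs = conic_coefficients(face, h1, h2)
    return coeffs is not None and all(k > 0 for k in coeffs)


if __name__ == "__main__":
    time_face = (1, 0)  # normal of the face t = 0
    # Diamond-shaped choice: hyperplanes t + i and t - i.
    print(allows_concurrent_start(time_face, (1, 1), (1, -1)))
    # Skewed choice: hyperplanes t and t + i (valid, but pipelined start).
    print(allows_concurrent_start(time_face, (1, 0), (1, 1)))
```

Under these assumptions, both hyperplane pairs are valid tilings, but only the diamond-shaped pair places the time face strictly inside the cone; the skewed pair needs a zero coefficient, which corresponds to the pipelined start-up the abstract describes.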
Pages: 1285-1298
Page count: 14