Diamond Tiling: Tiling Techniques to Maximize Parallelism for Stencil Computations

Cited by: 42
Authors
Bondhugula, Uday [1]
Bandishti, Vinayaka [1]
Pananilath, Irshad [1]
Affiliation
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Keywords
Compilers; program transformation; loop tiling; parallelism; locality; stencils
DOI
10.1109/TPDS.2016.2615094
Chinese Library Classification (CLC) number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of the iteration space and a set of tiling directions such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. However, existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique, called diamond tiling, that ensures concurrent start-up as well as perfect load-balance whenever possible. We first provide necessary and sufficient conditions for a set of tiling hyperplanes to allow concurrent start for programs with affine data accesses. We then provide an approach to automatically find such hyperplanes. Experimental evaluation on a 12-core Intel Westmere shows that diamond tiled code is able to outperform a tuned domain-specific stencil code generator by 10 to 40 percent, and previous compiler techniques by a factor of 1.3x to 10.1x.
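The abstract mentions necessary and sufficient conditions for concurrent start but does not state them. The following is a minimal sketch for a two-dimensional (time, space) iteration space, under the assumption that concurrent start along a face is possible when the face normal is a strictly positive combination of valid tiling hyperplanes. The dependence vectors, the face normal, and all function names are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from fractions import Fraction

# Assumed dependence distance vectors (time, space) of a 1-D three-point
# stencil: A[t+1][i] reads A[t][i-1], A[t][i], A[t][i+1].
DEPS = [(1, -1), (1, 0), (1, 1)]
FACE = (1, 0)  # normal of the t = 0 face of the iteration space


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_valid_tiling(hyperplanes, deps=DEPS):
    """Each hyperplane must have a non-negative component along every dependence."""
    return all(dot(h, d) >= 0 for h in hyperplanes for d in deps)


def cone_coefficients(h1, h2, face=FACE):
    """Solve face = k1*h1 + k2*h2 by Cramer's rule; None if h1 and h2 are parallel."""
    det = h1[0] * h2[1] - h1[1] * h2[0]
    if det == 0:
        return None
    k1 = Fraction(face[0] * h2[1] - face[1] * h2[0], det)
    k2 = Fraction(h1[0] * face[1] - h1[1] * face[0], det)
    return k1, k2


def allows_concurrent_start(h1, h2):
    """Assumed condition: valid hyperplanes with the face normal strictly inside their cone."""
    coeffs = cone_coefficients(h1, h2)
    return (
        is_valid_tiling([h1, h2])
        and coeffs is not None
        and all(k > 0 for k in coeffs)
    )


diamond = allows_concurrent_start((1, 1), (1, -1))   # diamond-shaped tiles
pipelined = allows_concurrent_start((1, 0), (1, 1))  # skewed tiles, pipelined start
print(diamond, pipelined)
```

Under these assumptions, the hyperplanes (1, 1) and (1, -1) place the face normal strictly inside their cone with coefficients 1/2 and 1/2, while (1, 0) and (1, 1) give a zero coefficient, which corresponds to the pipelined start-up described in the abstract.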
Pages: 1285-1298
Page count: 14
Related papers
35 in total
  • [1] Tiling Stencil Computations to Maximize Parallelism
    Bandishti, Vinayaka
    Pananilath, Irshad
    Bondhugula, Uday
    [J]. 2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012,
  • [2] Parameterized Diamond Tiling for Parallelizing Stencil Computations
    Wijesinghe, T.
    Senevirathne, K.
    Siriwardhana, C.
    Visitha, W.
    Jayasena, S.
    Rusira, T.
    Hall, M.
    [J]. 2017 3RD INTERNATIONAL MORATUWA ENGINEERING RESEARCH CONFERENCE (MERCON), 2017, : 99 - 104
  • [3] Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators
    Bertolacci, Ian J.
    Olschanowsky, Catherine
    Harshbarger, Ben
    Chamberlain, Bradford L.
    Wonnacott, David G.
    Strout, Michelle Mills
    [J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015, : 197 - 206
  • [4] TOAST: Automatic tiling for iterative stencil computations on GPUs
    Rocha, Rodrigo C. O.
    Pereira, Alyson D.
    Ramos, Luiz
    Goes, Luis F. W.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (08):
  • [5] Revisiting split tiling for stencil computations in polyhedral compilation
    Li, Yingying
    Sun, Huihui
    Pang, Jianmin
    [J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (01): : 440 - 470
  • [6] Revisiting split tiling for stencil computations in polyhedral compilation
    Yingying Li
    Huihui Sun
    Jianmin Pang
    [J]. The Journal of Supercomputing, 2022, 78 : 440 - 470
  • [7] Tiling Optimizations for Stencil Computations Using Rewrite Rules in LIFT
    Stoltzfus, Larisa
    Hagedorn, Bastian
    Steuwer, Michel
    Gorlatch, Sergei
    Dubach, Christophe
    [J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2019, 16 (04)
  • [8] The Relation Between Diamond Tiling and Hexagonal Tiling
    Grosser, Tobias
    Verdoolaege, Sven
    Cohen, Albert
    Sadayappan, P.
    [J]. PARALLEL PROCESSING LETTERS, 2014, 24 (03)
  • [9] Automatic tiling of iterative stencil loops
    Li, ZY
    Song, YH
    [J]. ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 2004, 26 (06): : 975 - 1028
  • [10] Loop tiling for optimization of locality and parallelism
    Liu, Song
    Wu, Weiguo
    Zhao, Bo
    Jiang, Qing
    [J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2015, 52 (05): : 1160 - 1176