Diamond Tiling: Tiling Techniques to Maximize Parallelism for Stencil Computations

Cited by: 42

Authors:

Bondhugula, Uday [1]

Bandishti, Vinayaka [1]

Pananilath, Irshad [1]

Affiliations:

[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India

Source:

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS | 2017 / Vol. 28 / No. 05

Keywords:

Compilers; program transformation; loop tiling; parallelism; locality; stencils;

DOI:

10.1109/TPDS.2016.2615094

Chinese Library Classification (CLC) number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Most stencil computations allow tile-wise concurrent start, i.e., there always exists a face of the iteration space and a set of tiling directions such that all tiles along that face can be started concurrently. This provides load balance and maximizes parallelism. However, existing automatic tiling frameworks often choose hyperplanes that lead to pipelined start-up and load imbalance. We address this issue with a new tiling technique, called diamond tiling, that ensures concurrent start-up as well as perfect load-balance whenever possible. We first provide necessary and sufficient conditions for a set of tiling hyperplanes to allow concurrent start for programs with affine data accesses. We then provide an approach to automatically find such hyperplanes. Experimental evaluation on a 12-core Intel Westmere shows that diamond tiled code is able to outperform a tuned domain-specific stencil code generator by 10 to 40 percent, and previous compiler techniques by a factor of 1.3x to 10.1x.
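The two properties named in the abstract can be checked by hand on a small case. The following is a minimal sketch, not the authors' implementation. It assumes a 1-D 3-point Jacobi-style stencil in (t, i) space, and it assumes that the concurrent-start condition can be stated as "the face normal is a strictly positive combination of the tiling hyperplanes", which is one reading of the necessary and sufficient condition mentioned above. The names `DEPS`, `is_valid_tiling`, `cone_coefficients`, and `allows_concurrent_start` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Assumed dependence vectors (dt, di) of a 1-D 3-point Jacobi-style stencil:
# A[t+1][i] reads A[t][i-1], A[t][i], A[t][i+1].
DEPS = [(1, -1), (1, 0), (1, 1)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def is_valid_tiling(hyperplanes, deps=DEPS):
    """Tiling validity: no hyperplane has a negative component along any dependence."""
    return all(dot(h, d) >= 0 for h in hyperplanes for d in deps)


def cone_coefficients(h1, h2, face):
    """Solve a*h1 + b*h2 = face exactly for two 2-D hyperplanes (Cramer's rule)."""
    det = h1[0] * h2[1] - h1[1] * h2[0]
    if det == 0:
        raise ValueError("hyperplanes are linearly dependent")
    a = Fraction(face[0] * h2[1] - face[1] * h2[0], det)
    b = Fraction(h1[0] * face[1] - h1[1] * face[0], det)
    return a, b


def allows_concurrent_start(h1, h2, face):
    """Assumed condition: face normal lies strictly inside the cone of h1 and h2."""
    a, b = cone_coefficients(h1, h2, face)
    return a > 0 and b > 0


TIME_FACE = (1, 0)                 # normal of the t = 0 face of the iteration space
DIAMOND = [(1, 1), (1, -1)]        # hyperplanes t + i and t - i
SKEWED = [(1, 0), (1, 1)]          # classic time-skewed hyperplanes t and t + i
RECTANGULAR = [(1, 0), (0, 1)]     # untransformed axes, not a legal tiling here

for name, hs in [("diamond", DIAMOND), ("skewed", SKEWED), ("rectangular", RECTANGULAR)]:
    valid = is_valid_tiling(hs)
    concurrent = valid and allows_concurrent_start(hs[0], hs[1], TIME_FACE)
    print(f"{name}: valid={valid}, concurrent_start={concurrent}")
```

Under these assumptions, both the diamond and the skewed hyperplanes are legal tilings, but only the diamond pair places the time-face normal strictly inside the cone. The skewed pair puts it on the cone boundary, which corresponds to the pipelined start-up described in the abstract.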


Pages: 1285 - 1298

Page count: 14

35 entries in total

[1] Tiling Stencil Computations to Maximize Parallelism
Bandishti, Vinayaka
Pananilath, Irshad
Bondhugula, Uday
[J]. 2012 INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SC), 2012
[2] Parameterized Diamond Tiling for Parallelizing Stencil Computations
Wijesinghe, T.
Senevirathne, K.
Siriwardhana, C.
Visitha, W.
Jayasena, S.
Rusira, T.
Hall, M.
[J]. 2017 3RD INTERNATIONAL MORATUWA ENGINEERING RESEARCH CONFERENCE (MERCON), 2017: 99 - 104
[3] Parameterized Diamond Tiling for Stencil Computations with Chapel parallel iterators
Bertolacci, Ian J.
Olschanowsky, Catherine
Harshbarger, Ben
Chamberlain, Bradford L.
Wonnacott, David G.
Strout, Michelle Mills
[J]. PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'15), 2015: 197 - 206
[4] TOAST: Automatic tiling for iterative stencil computations on GPUs
Rocha, Rodrigo C. O.
Pereira, Alyson D.
Ramos, Luiz
Goes, Luis F. W.
[J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (08)
[5] Revisiting split tiling for stencil computations in polyhedral compilation
Li, Yingying
Sun, Huihui
Pang, Jianmin
[J]. JOURNAL OF SUPERCOMPUTING, 2022, 78 (01): 440 - 470
[6] Revisiting split tiling for stencil computations in polyhedral compilation
Yingying Li
Huihui Sun
Jianmin Pang
[J]. The Journal of Supercomputing, 2022, 78: 440 - 470
[7] Tiling Optimizations for Stencil Computations Using Rewrite Rules in LIFT
Stoltzfus, Larisa
Hagedorn, Bastian
Steuwer, Michel
Gorlatch, Sergei
Dubach, Christophe
[J]. ACM TRANSACTIONS ON ARCHITECTURE AND CODE OPTIMIZATION, 2019, 16 (04)
[8] The Relation Between Diamond Tiling and Hexagonal Tiling
Grosser, Tobias
Verdoolaege, Sven
Cohen, Albert
Sadayappan, P.
[J]. PARALLEL PROCESSING LETTERS, 2014, 24 (03)
[9] Automatic tiling of iterative stencil loops
Li, ZY
Song, YH
[J]. ACM TRANSACTIONS ON PROGRAMMING LANGUAGES AND SYSTEMS, 2004, 26 (06): 975 - 1028
[10] Loop tiling for optimization of locality and parallelism
Liu, Song
Wu, Weiguo
Zhao, Bo
Jiang, Qing
[J]. Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2015, 52 (05): 1160 - 1176
