Effective Automatic Parallelization of Stencil Computations

Cited by: 61

Authors:

Krishnamoorthy, Sriram [1]

Baskaran, Muthu [1]

Bondhugula, Uday [1]

Ramanujam, J.

Rountev, Atanas [1]

Sadayappan, P. [1]

Affiliations:

[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA

Source:

PLDI'07: PROCEEDINGS OF THE 2007 ACM SIGPLAN CONFERENCE ON PROGRAMMING LANGUAGE DESIGN AND IMPLEMENTATION | 2007

Keywords:

Stencil computations; Tiling; Automatic parallelization; Load balance;

DOI:

10.1145/1250734.1250761

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

Performance optimization of stencil computations has been widely studied in the literature, since they occur in many computationally intensive scientific and engineering applications. Compiler frameworks have also been developed that can transform sequential stencil codes for optimization of data locality and parallelism. However, loop skewing is typically required in order to tile stencil codes along the time dimension, resulting in load imbalance in pipelined parallel execution of the tiles. In this paper, we develop an approach for automatic parallelization of stencil codes that explicitly addresses the issue of load-balanced execution of tiles. Experimental results are provided that demonstrate the effectiveness of the approach.
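
The load-imbalance problem named in the abstract can be illustrated with a small sketch. It is not taken from the paper and does not show the paper's solution; it only assumes a 1D three-point Jacobi stencil with fixed boundary values, the standard skew j = i + t, and rectangular tiles in (t, j) space. All function names, the stencil, and the tile sizes are hypothetical choices made for illustration. The sketch checks that the skewed, tiled loop nest matches the untiled one, then counts how many non-empty tiles sit on each tile wavefront, which is roughly the amount of work available to run concurrently at each pipeline step.

```python
def jacobi_reference(a0, steps):
    """Untiled 1D three-point Jacobi with fixed boundary values."""
    n = len(a0)
    # One row per time step; copying a0 keeps the boundary cells constant.
    grid = [list(a0) for _ in range(steps + 1)]
    for t in range(steps):
        for i in range(1, n - 1):
            grid[t + 1][i] = (grid[t][i - 1] + grid[t][i] + grid[t][i + 1]) / 3.0
    return grid[steps]


def jacobi_skewed_tiled(a0, steps, tile_t, tile_j):
    """Same computation, skewed with j = i + t and tiled in (t, j).

    After skewing, the dependence distances in (t, j) are (1, 0), (1, 1)
    and (1, 2), all non-negative, so rectangular tiles are legal.
    """
    n = len(a0)
    grid = [list(a0) for _ in range(steps + 1)]
    j_max = (n - 2) + (steps - 1)  # largest skewed index that is ever used
    for tt in range(0, steps, tile_t):
        for jj in range(0, j_max + 1, tile_j):
            for t in range(tt, min(tt + tile_t, steps)):
                for j in range(jj, min(jj + tile_j, j_max + 1)):
                    i = j - t  # undo the skew
                    if 1 <= i <= n - 2:
                        grid[t + 1][i] = (
                            grid[t][i - 1] + grid[t][i] + grid[t][i + 1]
                        ) / 3.0
    return grid[steps]


def tiles_per_wavefront(n, steps, tile_t, tile_j):
    """Count non-empty tiles on each wavefront (tile row + tile column)."""
    j_max = (n - 2) + (steps - 1)
    counts = {}
    for a, tt in enumerate(range(0, steps, tile_t)):
        for b, jj in enumerate(range(0, j_max + 1, tile_j)):
            nonempty = any(
                1 <= j - t <= n - 2
                for t in range(tt, min(tt + tile_t, steps))
                for j in range(jj, min(jj + tile_j, j_max + 1))
            )
            if nonempty:
                counts[a + b] = counts.get(a + b, 0) + 1
    return [counts[w] for w in sorted(counts)]


if __name__ == "__main__":
    a0 = [float(i * i % 7) for i in range(10)]
    same = jacobi_skewed_tiled(a0, 4, 2, 4) == jacobi_reference(a0, 4)
    print("tiled result matches reference:", same)
    print("tiles per wavefront:", tiles_per_wavefront(10, 4, 2, 4))
```

Tiles on the same wavefront do not depend on each other, so the per-wavefront counts approximate the available parallelism over time. Under these assumptions the counts ramp up at the start and fall off at the end, so some processors sit idle during pipeline fill and drain.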

Pages: 235-244

Page count: 10
