Effective Automatic Parallelization of Stencil Computations

Cited by: 61
Authors
Krishnamoorthy, Sriram [1]
Baskaran, Muthu [1]
Bondhugula, Uday [1]
Ramanujam, J.
Rountev, Atanas [1]
Sadayappan, P. [1]
Affiliations
[1] Ohio State Univ, Dept Comp Sci & Engn, Columbus, OH 43210 USA
Keywords
Stencil computations; Tiling; Automatic parallelization; Load balance
DOI
10.1145/1250734.1250761
Chinese Library Classification (CLC) number
TP31 [Computer software]
Discipline classification code
081202; 0835
Abstract
Performance optimization of stencil computations has been widely studied in the literature, since they occur in many computationally intensive scientific and engineering applications. Compiler frameworks have also been developed that can transform sequential stencil codes for optimization of data locality and parallelism. However, loop skewing is typically required in order to tile stencil codes along the time dimension, resulting in load imbalance in pipelined parallel execution of the tiles. In this paper, we develop an approach for automatic parallelization of stencil codes that explicitly addresses the issue of load-balanced execution of tiles. Experimental results are provided that demonstrate the effectiveness of the approach.
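The load imbalance mentioned in the abstract can be illustrated with a small sketch. It is not the paper's method; it only models the problem the paper sets out to address. It assumes a 1D 3-point stencil over T time steps and N grid points, a skew of s = i + t, rectangular tiles of size bt x bi in the skewed (t, s) space, and a pipelined schedule in which all tiles with the same sum of tile coordinates run concurrently. The function name and parameters are hypothetical, chosen for illustration.

```python
from collections import Counter

def wavefront_tile_counts(T, N, bt, bi):
    """Count non-empty tiles per wavefront for a skewed 1D 3-point stencil.

    Assumptions (illustrative, not taken from the paper):
      - interior points i = 1 .. N-2 are updated at each time step t = 0 .. T-1
      - skewing maps (t, i) to (t, s) with s = i + t, so every dependence
        has non-negative components and rectangular tiling is legal
      - tile (a, b) covers t in [a*bt, (a+1)*bt) and s in [b*bi, (b+1)*bi)
      - tiles with equal a + b form one wavefront and may run in parallel
    """
    tiles = set()
    for t in range(T):
        for i in range(1, N - 1):
            s = i + t  # skewed space coordinate
            tiles.add((t // bt, s // bi))
    per_wavefront = Counter(a + b for a, b in tiles)
    first, last = min(per_wavefront), max(per_wavefront)
    return [per_wavefront[w] for w in range(first, last + 1)]

if __name__ == "__main__":
    # The available parallelism ramps up and then drains, so processors
    # sit idle during pipeline fill and drain.
    print(wavefront_tile_counts(T=8, N=16, bt=2, bi=4))
```

Under these assumptions the number of tiles that can run concurrently varies from wavefront to wavefront, which is the pipeline fill and drain effect that leaves some processors idle.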
Pages: 235-244
Page count: 10
Related papers
50 in total
  • [1] Effective automatic parallelization of stencil computations
    Krishnamoorthy, Sriram
    Baskaran, Muthu
    Bondhugula, Uday
    Ramanujam, J.
    Rountev, Atanas
    Sadayappan, P.
    [J]. ACM SIGPLAN NOTICES, 2007, 42 (06) : 235 - 244
  • [2] Automatic Adaptive Approximation for Stencil Computations
    Schmitt, Maxime
    Helluy, Philippe
    Bastoul, Cedric
    [J]. PROCEEDINGS OF THE 28TH INTERNATIONAL CONFERENCE ON COMPILER CONSTRUCTION (CC '19), 2019, : 170 - 181
  • [3] Multidimensional Intratile Parallelization for Memory-Starved Stencil Computations
    Malas, Tareq M.
    Hager, Georg
    Ltaief, Hatem
    Keyes, David E.
    [J]. ACM TRANSACTIONS ON PARALLEL COMPUTING, 2018, 4 (03)
  • [4] A Multilevel Parallelization Framework for High-Order Stencil Computations
    Dursun, Hikmet
    Nomura, Ken-ichi
    Peng, Liu
    Seymour, Richard
    Wang, Weiqiang
    Kalia, Rajiv K.
    Nakano, Aiichiro
    Vashishta, Priya
    [J]. EURO-PAR 2009: PARALLEL PROCESSING, PROCEEDINGS, 2009, 5704 : 642 - 653
  • [5] Automatic Partitioning of Stencil Computations on Heterogeneous Systems
    Pereira, Alyson D.
    Rocha, Rodrigo C. O.
    Ramos, Luiz
    Castro, Marcio
    Goes, Luis F. W.
    [J]. 2017 INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE AND HIGH PERFORMANCE COMPUTING WORKSHOPS (SBAC-PADW), 2017, : 43 - 48
  • [6] Automatic Performance Tuning of Stencil Computations on GPUs
    Garvey, Joseph D.
    Abdelrahman, Tarek S.
    [J]. 2015 44TH INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING (ICPP), 2015, : 300 - 309
  • [7] Efficient multicore-aware parallelization strategies for iterative stencil computations
    Treibig, Jan
    Wellein, Gerhard
    Hager, Georg
    [J]. JOURNAL OF COMPUTATIONAL SCIENCE, 2011, 2 (02) : 130 - 137
  • [8] Toward an automatic parallelization of sparse matrix computations
    Adle, R
    Aiguier, M
    Delaplace, F
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2005, 65 (03) : 313 - 330
  • [9] TOAST: Automatic tiling for iterative stencil computations on GPUs
    Rocha, Rodrigo C. O.
    Pereira, Alyson D.
    Ramos, Luiz
    Goes, Luis F. W.
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2017, 29 (08):
  • [10] Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters
    Hikmet Dursun
    Manaschai Kunaseth
    Ken-ichi Nomura
    Jacqueline Chame
    Robert F. Lucas
    Chun Chen
    Mary Hall
    Rajiv K. Kalia
    Aiichiro Nakano
    Priya Vashishta
    [J]. The Journal of Supercomputing, 2012, 62 : 946 - 966