The memory behavior of cache oblivious stencil computations

Cited by: 0

Authors:

Matteo Frigo

Volker Strumpen

Affiliation:

[1] IBM Austin Research Laboratory

Source:

The Journal of Supercomputing | 2007 / Vol. 39, No. 2

Keywords:

Cache oblivious algorithms; Stencil computations; Analysis of algorithms; Performance analysis; System simulation

DOI:

Not available


Abstract:

We present and evaluate a cache oblivious algorithm for stencil computations, which arise, for example, in finite-difference methods. Our algorithm applies to arbitrary stencils in n-dimensional spaces. On an “ideal cache” of size Z, our algorithm saves a factor of Θ(Z^{1/n}) cache misses compared to a naive algorithm, and it exploits temporal locality optimally throughout the entire memory hierarchy. We evaluate our algorithm in terms of the number of cache misses, and demonstrate that the memory behavior agrees with our theoretical predictions. Our experimental evaluation is based on a finite-difference solution of a heat diffusion problem, as well as a Gauss-Seidel iteration and a 2-dimensional LBMHD program, both reformulated as cache oblivious stencil computations.
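The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea for the 1-dimensional case (n = 1), assuming a 3-point heat-diffusion stencil with fixed boundary values. It contrasts a naive time-step-by-time-step sweep with a recursive trapezoidal space-time decomposition of the kind usually associated with cache oblivious stencil codes. The names `kernel`, `naive`, `walk1`, `make_grid`, and `run`, the coefficient 0.125, and the initial data are illustrative assumptions, not taken from this record. The sketch only shows that both traversal orders compute the same values; it does not measure cache misses.

```python
DS = 1  # assumed stencil slope: a point depends on neighbors at distance <= 1


def kernel(u, t, x):
    """Assumed 3-point heat update: write time t+1 at x from time t."""
    cur, nxt = u[t % 2], u[(t + 1) % 2]
    nxt[x] = cur[x] + 0.125 * (cur[x - 1] - 2.0 * cur[x] + cur[x + 1])


def naive(u, T, N):
    """Baseline: sweep the whole interior once per time step."""
    for t in range(T):
        for x in range(1, N - 1):
            kernel(u, t, x)


def walk1(u, t0, t1, x0, dx0, x1, dx1):
    """Visit the space-time trapezoid with time range [t0, t1), base [x0, x1),
    and edge slopes dx0, dx1, recursively cutting it in space or in time."""
    dt = t1 - t0
    if dt == 1:
        for x in range(x0, x1):
            kernel(u, t0, x)
    elif dt > 1:
        if 2 * (x1 - x0) + (dx1 - dx0) * dt >= 4 * DS * dt:
            # Wide trapezoid: space cut along a line of slope -DS.
            xm = (2 * (x0 + x1) + (2 * DS + dx0 + dx1) * dt) // 4
            walk1(u, t0, t1, x0, dx0, xm, -DS)
            walk1(u, t0, t1, xm, -DS, x1, dx1)
        else:
            # Tall trapezoid: time cut at half height.
            s = dt // 2
            walk1(u, t0, t0 + s, x0, dx0, x1, dx1)
            walk1(u, t0 + s, t1, x0 + dx0 * s, dx0, x1 + dx1 * s, dx1)


def make_grid(N):
    """Two buffers (time parity 0 and 1) holding the same arbitrary test data."""
    row = [float((7 * i) % 11) for i in range(N)]
    return [row[:], row[:]]


def run(N, T, cache_oblivious):
    """Advance T time steps on N points; cells 0 and N-1 stay fixed."""
    u = make_grid(N)
    if cache_oblivious:
        walk1(u, 0, T, 1, 0, N - 1, 0)
    else:
        naive(u, T, N)
    return u[T % 2]


if __name__ == "__main__":
    print(run(200, 64, True) == run(200, 64, False))
```

Both variants apply the same `kernel` to the same space-time points; only the visiting order differs. The recursive order keeps working on a small space-time region for several time steps before moving on, which is where the temporal locality described in the abstract would come from.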


Pages: 93-112

Page count: 19
