High Performance Stencil Code Algorithms for GPGPUs

Cited by: 35
Authors
Schaefer, Andreas [1]
Fey, Dietmar [1]
Affiliation
[1] Univ Erlangen Nurnberg, Chair Comp Sci Comp Architecture 3, D-91054 Erlangen, Germany
Keywords
stencil codes; GPU; high performance computing; temporal blocking; Jacobi solver; CUDA
DOI
10.1016/j.procs.2011.04.221
Chinese Library Classification number
TP301 [Theory, methods]
Discipline classification code
081202
Abstract
In this paper we investigate how stencil computations can be implemented on state-of-the-art general purpose graphics processing units (GPGPUs). Stencil codes are at the core of many numerical solvers and physical simulation codes and are therefore of particular interest to scientific computing research. GPGPUs have recently gained a lot of attention because of their superior floating-point performance and memory bandwidth. Nevertheless, memory-bound stencil codes in particular have proven challenging for GPGPUs, yielding lower-than-expected speedups. We chose the Jacobi method as a standard benchmark to evaluate a set of algorithms on NVIDIA's latest Fermi chipset. One of our fastest algorithms is a parallel wavefront update. It exploits the enlarged on-chip shared memory to perform two time step updates per sweep. To the best of our knowledge, it represents the first successful application of temporal blocking for 3D stencils on GPGPUs and thereby exceeds previous results by a considerable margin. This is also the first paper to study stencil codes on Fermi.
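The idea of performing two time step updates per sweep can be illustrated with a small CPU-side sketch. This is not the authors' CUDA kernel: it is a serial, plain-Python illustration that assumes a 7-point Jacobi stencil with fixed boundary cells and a grid of at least three points per dimension. The function names (`update_plane`, `jacobi_step`, `jacobi_two_steps_wavefront`) are hypothetical. The three-plane rolling window stands in for the on-chip buffer that would hold intermediate results on a GPU.

```python
def update_plane(below, mid, above):
    """Jacobi update of plane `mid` from its two z-neighbours; edge cells are copied."""
    ny, nx = len(mid), len(mid[0])
    out = [row[:] for row in mid]
    for y in range(1, ny - 1):
        for x in range(1, nx - 1):
            out[y][x] = (below[y][x] + above[y][x] +
                         mid[y - 1][x] + mid[y + 1][x] +
                         mid[y][x - 1] + mid[y][x + 1]) / 6.0
    return out


def copy_plane(plane):
    return [row[:] for row in plane]


def jacobi_step(src):
    """Reference: one full sweep (one time step) over a 3D grid of nested lists."""
    nz = len(src)
    dst = [copy_plane(src[0])]
    for z in range(1, nz - 1):
        dst.append(update_plane(src[z - 1], src[z], src[z + 1]))
    dst.append(copy_plane(src[-1]))
    return dst


def jacobi_two_steps_wavefront(src):
    """Two time steps in a single sweep along z (illustrative assumption, not the paper's kernel).

    Only three planes of the intermediate time step are alive at any moment.
    """
    nz = len(src)

    def step1(z):
        # Plane z after the first time step; boundary planes stay fixed.
        if z == 0 or z == nz - 1:
            return copy_plane(src[z])
        return update_plane(src[z - 1], src[z], src[z + 1])

    dst = [copy_plane(src[0])]
    window = [step1(0), step1(1)]      # intermediate planes z-1 and z
    for z in range(1, nz - 1):
        window.append(step1(z + 1))    # first time step runs one plane ahead
        dst.append(update_plane(window[0], window[1], window[2]))
        window.pop(0)                  # slide the window forward
    dst.append(copy_plane(src[-1]))
    return dst
```

Each source plane is read once per sweep while two time steps are produced, which is the source of the memory-traffic saving that temporal blocking aims for. The result matches two consecutive calls to `jacobi_step`.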
Pages: 2027 - 2036
Page count: 10
Related papers
50 in total
  • [21] Landing Stencil Code on Godson-T
    崔慧敏
    王蕾
    范东睿
    冯晓兵
    Journal of Computer Science & Technology, 2010, 25 (04) : 886 - 894
  • [22] Porting a Legacy CUDA Stencil Code to oneAPI
    Christgau, Steffen
    Zuse, Thomas Steink
    2020 IEEE 34TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW 2020), 2020, : 359 - 367
  • [23] Radiation Sensitivity of High Performance Computing Applications on Kepler-Based GPGPUs
    Oliveira, Daniel A. G.
    Lunardi, Caio B.
    Pilla, Laercio L.
    Rech, Paolo
    Navaux, Philippe O. A.
    Carro, Luigi
    2014 44TH ANNUAL IEEE/IFIP INTERNATIONAL CONFERENCE ON DEPENDABLE SYSTEMS AND NETWORKS (DSN), 2014, : 732 - 737
  • [24] Enabling efficient stencil code generation in OpenACC
    Pereira, Alyson D.
    Rocha, Rodrigo C. O.
    Castro, Marcio
    Goes, Luis F. W.
    Dantas, Mario A. R.
    INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE (ICCS 2017), 2017, 108 : 2333 - 2337
  • [25] ExaStencils: Advanced Stencil-Code Engineering
    Lengauer, Christian
    Apel, Sven
    Bolten, Matthias
    Groesslinger, Armin
    Hannig, Frank
    Koestler, Harald
    Ruede, Ulrich
    Teich, Juergen
    Grebhahn, Alexander
    Kronawitter, Stefan
    Kuckuk, Sebastian
    Rittich, Hannah
    Schmitt, Christian
    EURO-PAR 2014: PARALLEL PROCESSING WORKSHOPS, PT II, 2014, 8806 : 553 - 564
  • [26] Optimizing Stencil Code via Locality of Computation
    Luo, Yulong
    Tan, Guangming
    PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14), 2014, : 477 - 478
  • [27] Landing stencil code on godson-T
    Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
    Unknown
    J Comput Sci Technol, 4 (886-894):
  • [28] High Performance Stencil Computations for Intel® Xeon Phi™ Coprocessor
    Feng, Luxia
    Dong, Yushan
    Li, Chunjiang
    Jiang, Hao
    ADVANCED COMPUTER ARCHITECTURE, ACA 2016, 2016, 626 : 108 - 117
  • [29] Landing Stencil Code on Godson-T
    Hui-Min Cui
    Lei Wang
    Dong-Rui Fan
    Xiao-Bing Feng
    Journal of Computer Science and Technology, 2010, 25 : 886 - 894
  • [30] Landing Stencil Code on Godson-T
    Cui, Hui-Min
    Wang, Lei
    Fan, Dong-Rui
    Feng, Xiao-Bing
    JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2010, 25 (04) : 886 - 894