Vectorization and Minimization of Memory Footprint for Linear High-Order Discontinuous Galerkin Schemes

Authors
Gallard, Jean-Matthieu [1]
Rannabauer, Leonhard [1]
Reinarz, Anne [1]
Bader, Michael [1]
Affiliations
[1] Tech Univ Munich, Dept Informat, Munich, Germany
Keywords
ExaHyPE; Code Generation; High-Order Discontinuous Galerkin; ADER; Hyperbolic PDE Systems; Vectorization; Array-of-Struct-of-Array
DOI
10.1109/IPDPSW50202.2020.00126
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
We present a sequence of optimizations to the performance-critical compute kernels of the high-order discontinuous Galerkin solver of the hyperbolic PDE engine ExaHyPE, successively tackling bottlenecks due to SIMD operations, cache hierarchies and restrictions in the software design. Starting from a generic scalar implementation of the numerical scheme, our first optimized variant applies state-of-the-art optimization techniques by vectorizing loops, improving the data layout and using Loop-over-GEMM to perform tensor contractions via highly optimized matrix multiplication functions provided by the LIBXSMM library. We show that memory stalls due to a memory footprint exceeding our L2 cache size hindered the vectorization gains. We therefore introduce a new kernel that applies a sum factorization approach to reduce the kernel's memory footprint and improve its cache locality. With the L2 cache bottleneck removed, we were able to exploit additional vectorization opportunities by introducing a hybrid Array-of-Structure-of-Array data layout that resolves the data layout conflict between the matrix multiplication kernels and the point-wise functions that implement PDE-specific terms. With this last kernel, evaluated in a benchmark simulation at high polynomial order, only 2% of the floating point operations are still performed using scalar instructions and 22.5% of the available performance is achieved.
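The layout conflict mentioned in the abstract can be illustrated with a generic index-mapping sketch. This is not the ExaHyPE implementation: the function names, the flat-list storage and the block width `width` are assumptions made for illustration, and the paper's actual layout (which dimension is blocked, padding, alignment) may differ. In an Array-of-Struct layout all variables of one quadrature point are contiguous, in a Struct-of-Array layout all points of one variable are contiguous, and the hybrid Array-of-Struct-of-Array layout groups points into blocks of SIMD width and stores each variable contiguously within a block.

```python
# Hypothetical helpers: flat-index formulas for three data layouts.

def aos_index(point, var, n_vars):
    # Array-of-Struct: variables of one point are adjacent.
    return point * n_vars + var

def soa_index(point, var, n_points):
    # Struct-of-Array: points of one variable are adjacent.
    return var * n_points + point

def aosoa_index(point, var, n_vars, width):
    # Array-of-Struct-of-Array: points are grouped into blocks of
    # `width` lanes; inside a block each variable is contiguous.
    block, lane = divmod(point, width)
    return (block * n_vars + var) * width + lane

def to_aosoa(aos, n_points, n_vars, width):
    # Reorder a flat AoS list into AoSoA order.
    # Assumes n_points is a multiple of width (no padding handled).
    if n_points % width != 0:
        raise ValueError("n_points must be a multiple of width")
    out = [0.0] * (n_points * n_vars)
    for p in range(n_points):
        for v in range(n_vars):
            out[aosoa_index(p, v, n_vars, width)] = aos[aos_index(p, v, n_vars)]
    return out

# Example: 8 points, 3 variables, blocks of 4 lanes.
# The value 10 * p + v encodes (point p, variable v).
n_points, n_vars, width = 8, 3, 4
aos = [10 * p + v for p in range(n_points) for v in range(n_vars)]
aosoa = to_aosoa(aos, n_points, n_vars, width)
```

In the reordered list, the first `width` entries hold variable 0 of points 0 to 3, so a point-wise function can load one variable for several points with a single vector instruction, while each block still keeps all variables of its points close together in memory.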
Pages: 711-720
Number of pages: 10