Cache-Aware Matrix Polynomials

Times cited: 2
Authors
Huber, Dominik [1]
Schreiber, Martin [1]
Yang, Dai [2]
Schulz, Martin [1]
Affiliations
[1] Tech Univ Munich, Dept Informat, Munich, Germany
[2] NVIDIA, Munich, Germany
Source
Keywords
Cache-blocking in time dimension; Matrix exponentiation; Higher-order time integration; Stencil computations; Blocking; Core
DOI
10.1007/978-3-030-50371-0_10
Chinese Library Classification number
TP39 [Computer applications]
Subject classification codes
081203; 0835
Abstract
Efficient solvers for partial differential equations are among the most important areas of algorithmic research in high-performance computing. In this paper, we present a new optimization for solving linear autonomous partial differential equations. Our approach is based on polynomial approximations for exponential time integration, which require computing matrix polynomial terms (M^p v) in every time step. This operation is highly memory-intensive and requires targeted optimizations. We exploit the cache hierarchy of modern computer architectures by applying temporal cache blocking over the matrix polynomial terms. We develop two single-core implementations that realize cache blocking over several sparse matrix-vector multiplications of the polynomial approximation and compare them to a reference method that performs the computation in the traditional iterative way. We evaluate our approach on three different hardware platforms and for a wide range of matrices, and demonstrate time savings of up to 50% for a large number of matrices. The savings are most pronounced on platforms with large caches, where they significantly increase the performance of solving linear autonomous differential equations.
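The contrast the abstract draws between the traditional iterative evaluation and cache blocking over several sparse matrix-vector products can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a banded matrix with a known half-bandwidth and uses a simple skewed (wavefront) blocking scheme chosen for illustration. All names (`spmv_rows`, `poly_reference`, `poly_blocked`, `tridiag_csr`, `half_bw`, `block`) are hypothetical. Both functions evaluate sum_p coeffs[p] * M^p v.

```python
def spmv_rows(csr, x, y, lo, hi):
    """Write rows lo..hi-1 of the product M @ x into y (M given in CSR form)."""
    indptr, indices, data = csr
    for i in range(lo, hi):
        acc = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            acc += data[k] * x[indices[k]]
        y[i] = acc


def poly_reference(csr, coeffs, v):
    """Iterative evaluation: one full sweep over M per polynomial term."""
    n = len(v)
    result = [coeffs[0] * x for x in v]
    prev = list(v)
    for c in coeffs[1:]:
        cur = [0.0] * n
        spmv_rows(csr, prev, cur, 0, n)  # cur = M @ prev
        for i in range(n):
            result[i] += c * cur[i]
        prev = cur
    return result


def poly_blocked(csr, coeffs, v, half_bw, block):
    """Blocked evaluation (illustrative assumption: M[i][j] == 0 if |i-j| > half_bw).

    Within each row block, all powers are advanced before moving on, so the
    rows of M touched by the block can be reused while still in cache. The
    row range shrinks to the left by half_bw per power, so every row of
    power p only reads rows of power p-1 that are already computed.
    """
    n = len(v)
    degree = len(coeffs) - 1
    powers = [list(v)] + [[0.0] * n for _ in range(degree)]
    result = [coeffs[0] * x for x in v]
    # Extra trailing blocks let the skewed ranges of high powers reach row n-1.
    for start in range(0, n + max(degree - 1, 0) * half_bw, block):
        for p in range(1, degree + 1):
            shift = (p - 1) * half_bw
            lo = max(0, start - shift)
            hi = min(n, start + block - shift)
            spmv_rows(csr, powers[p - 1], powers[p], lo, hi)
            for i in range(lo, hi):
                result[i] += coeffs[p] * powers[p][i]
    return result


def tridiag_csr(n, sub, diag, sup):
    """Build a constant tridiagonal test matrix in CSR form (half-bandwidth 1)."""
    indptr, indices, data = [0], [], []
    for i in range(n):
        for j, val in ((i - 1, sub), (i, diag), (i + 1, sup)):
            if 0 <= j < n:
                indices.append(j)
                data.append(val)
        indptr.append(len(indices))
    return indptr, indices, data


if __name__ == "__main__":
    csr = tridiag_csr(10, 0.5, -1.0, 0.25)
    v = [float(i) for i in range(10)]
    coeffs = [1.0, 1.0, 0.5, 1.0 / 6.0]  # truncated Taylor series of exp
    ref = poly_reference(csr, coeffs, v)
    blk = poly_blocked(csr, coeffs, v, half_bw=1, block=4)
    print(max(abs(a - b) for a, b in zip(ref, blk)))
```

The coefficients 1/p! in the usage example are the truncated Taylor series of the exponential, one possible polynomial approximation. In this sketch the blocked variant still stores every intermediate power in full, so it shows only the reordering of the loop nest and gives no indication of the memory footprint or performance of a tuned implementation.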
Pages: 132-146
Number of pages: 15