A MEMORY EFFICIENT AND FAST SPARSE MATRIX VECTOR PRODUCT ON A GPU

Cited by: 56
Authors
Dziekonski, A. [1]
Lamecki, A. [1]
Mrozowski, M. [1]
Affiliations
[1] Gdansk Univ Technol GUT, Fac Elect Telecommun & Informat ETI, WiComm Ctr Excellence, PL-80233 Gdansk, Poland
Source
PROGRESS IN ELECTROMAGNETICS RESEARCH-PIER | 2011 / Vol. 116
Keywords
FINITE-ELEMENT-METHOD; FDTD METHOD; SCATTERING; ALGORITHM; UNITS;
DOI
10.2528/PIER11031607
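Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline classification code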
0808 ; 0809 ;
Abstract
This paper proposes a new sparse matrix storage format that allows an efficient implementation of a sparse matrix-vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of large, sparse, complex-valued systems of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single-precision arithmetic. Compared to the optimized six-core Central Processing Unit (CPU) (Intel Xeon 5680), this performance implies a speedup by a factor of six. In terms of speed, the new format is as fast as the best format published so far, and at the same time it does not introduce redundant zero elements that have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low-cost commodity GPUs with a limited amount of on-board memory.
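The abstract does not spell out the layout of Sliced ELLR-T. As a rough illustration only, assuming the format follows the general sliced-ELLPACK idea its name suggests (rows grouped into slices, each slice padded only to its own longest row, plus a per-row length array), a minimal sketch might look as follows. All names (`to_sliced_ell`, `spmv`, `stored_slots`, `slice_height`) are hypothetical, and this is plain CPU code in Python, not the authors' GPU kernel.

```python
# Illustrative sliced-ELL layout with per-row lengths.
# Assumption: this mirrors the general idea only, not the paper's exact format.

def to_sliced_ell(rows, slice_height):
    """rows: list of rows, each a list of (col, value) pairs for the nonzeros."""
    slices = []
    for start in range(0, len(rows), slice_height):
        block = rows[start:start + slice_height]
        # Padding width is chosen per slice, not for the whole matrix.
        width = max((len(r) for r in block), default=0)
        cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in block]
        vals = [[v for _, v in r] + [0] * (width - len(r)) for r in block]
        lengths = [len(r) for r in block]  # lets the product skip padded slots
        slices.append((start, width, cols, vals, lengths))
    return slices

def spmv(slices, x, n_rows):
    """Compute y = A @ x from the sliced layout."""
    y = [0] * n_rows
    for start, _width, cols, vals, lengths in slices:
        for i, n in enumerate(lengths):
            acc = 0
            for k in range(n):  # only the real nonzeros of this row
                acc += vals[i][k] * x[cols[i][k]]
            y[start + i] = acc
    return y

def stored_slots(slices):
    """Number of value slots stored, including padding."""
    return sum(width * len(lengths) for _, width, _, _, lengths in slices)

# A small complex-valued example: one long row and three short ones.
rows = [
    [(0, 1 + 1j), (1, 2), (2, 3), (3, 4)],
    [(1, 5)],
    [(2, 6j)],
    [(0, 7), (3, 8)],
]
x = [1, 1j, 2, -1]

sliced = to_sliced_ell(rows, slice_height=2)
plain_ell = to_sliced_ell(rows, slice_height=4)  # one slice = ordinary ELL

print(spmv(sliced, x, len(rows)))
print(stored_slots(sliced), stored_slots(plain_ell))  # 12 vs 16 slots for 8 nonzeros
```

In this toy case the single long row forces padding only within its own slice, so fewer padded slots are stored than with one global width. How closely this matches the memory savings reported for the actual format is not something the abstract lets one check.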
Pages: 49 - 63
Number of pages: 15
Related papers
50 in total
  • [41] Adaptive sparse matrix representation for efficient matrix-vector multiplication
    Zardoshti, Pantea
    Khunjush, Farshad
    Sarbazi-Azad, Hamid
    JOURNAL OF SUPERCOMPUTING, 2016, 72 (09): 3366 - 3386
  • [42] Shuffle Reduction Based Sparse Matrix-Vector Multiplication on Kepler GPU
    Yuan Tao
    Huang Zhi-Bin
    INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2016, 9 (10): 99 - 106
  • [43] A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
    Guo, Dahai
    Gropp, William
    Olson, Luke N.
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30 (01): 103 - 120
  • [44] Learning Sparse Matrix Row Permutations for Efficient SpMM on GPU Architectures
    Mehrabi, Atefeh
    Lee, Donghyuk
    Chatterjee, Niladrish
    Sorin, Daniel J.
    Lee, Benjamin C.
    O'Connor, Mike
    2021 IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE (ISPASS 2021), 2021: 48 - 58
  • [45] GPU-Accelerated Sparse Matrix Vector Product based on Element-by-Element Method for Unstructured FEM using OpenACC
    Kusakabe, Ryota
    Fujita, Kohei
    Ichimura, Tsuyoshi
    Hori, Muneo
    Lalith, Maddegedara
    2022 WORKSHOP ON ACCELERATOR PROGRAMMING USING DIRECTIVES (WACCPD), 2022: 52 - 61
  • [46] Exploiting dense substructures for fast sparse matrix vector multiplication
    Shantharam, Manu
    Chatterjee, Anirban
    Raghavan, Padma
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2011, 25 (03): 328 - 341
  • [47] The fast reduced QMC matrix-vector product
    Dick, Josef
    Ebert, Adrian
    Herrmann, Lukas
    Kritzer, Peter
    Longo, Marcello
    JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2024, 440
  • [48] Efficient spectral integral-based scheme for fast matrix-vector product algorithm
    Valero-Nogueira, A
    Rojas, RG
    1999 SBMO/IEEE MTT-S INTERNATIONAL MICROWAVE AND OPTOELECTRONICS CONFERENCE, PROCEEDINGS, VOLS 1 & 2: WIRELESS AND PHOTONICS BUILDING THE GLOBAL INFOWAYS, 1999: 389 - 391
  • [49] Efficient Full-Chip Statistical Leakage Analysis Based on Fast Matrix Vector Product
    Gao, Mingzhi
    Ye, Zuochang
    Wang, Yan
    Yu, Zhiping
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2012, 31 (03): 356 - 369
  • [50] From a Sparse Vector to a Sparse Symmetric Matrix for Efficient Lossy Speech Compression
    Omara, A. N.
    Hefnawy, A. A.
    Zekry, Abdelhalim
    2017 INTL CONF ON ADVANCED CONTROL CIRCUITS SYSTEMS (ACCS) SYSTEMS & 2017 INTL CONF ON NEW PARADIGMS IN ELECTRONICS & INFORMATION TECHNOLOGY (PEIT), 2017: 98 - 104