A MEMORY EFFICIENT AND FAST SPARSE MATRIX VECTOR PRODUCT ON A GPU

Cited by: 56
Authors
Dziekonski, A. [1]
Lamecki, A. [1]
Mrozowski, M. [1]
Affiliations
[1] Gdansk Univ Technol GUT, Fac Elect Telecommun & Informat ETI, WiComm Ctr Excellence, PL-80233 Gdansk, Poland
Keywords
FINITE-ELEMENT-METHOD; FDTD METHOD; SCATTERING; ALGORITHM; UNITS;
DOI
10.2528/PIER11031607
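Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808; 0809
Abstract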
This paper proposes a new sparse matrix storage format that allows an efficient implementation of the sparse matrix-vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it combines a low memory footprint with good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of large, sparse, complex-valued systems of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single-precision arithmetic. Compared to an optimized implementation on a six-core Central Processing Unit (CPU) (Intel Xeon 5680), this performance implies a speedup by a factor of six. In terms of speed, the new format is as fast as the best format published so far, and at the same time it does not introduce the redundant zero elements that otherwise have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low-cost commodity GPUs with a limited amount of on-board memory.
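The abstract does not spell out the layout of Sliced ELLR-T, so the following is only a minimal sketch of the general idea the name suggests: a sliced ELLPACK-style layout with a per-row length array. It is not the authors' format or their GPU kernel. All names (`build_sliced_ell`, `spmv_sliced_ell`, `slice_size`) and the example matrix are assumptions made for illustration, and the code runs serially on the CPU. Rows are grouped into slices, each slice is padded only to its own longest row, and the row lengths let the product skip padding entries. Unlike the format described above, this simplified sketch still stores some padding zeros inside a slice.

```python
# Illustrative sliced ELLPACK-style storage with per-row lengths.
# Hypothetical names; NOT the paper's Sliced ELLR-T implementation.

def build_sliced_ell(rows, slice_size):
    """rows: list of rows, each a list of (column, value) nonzero pairs."""
    values, columns, row_len = [], [], []
    slice_ptr = [0]  # start offset of each slice in values/columns
    for start in range(0, len(rows), slice_size):
        chunk = rows[start:start + slice_size]
        width = max((len(r) for r in chunk), default=0)
        # Column-major inside a slice: the k-th entries of neighbouring rows
        # are adjacent, which is what makes GPU memory access coalesced.
        for k in range(width):
            for r in chunk:
                if k < len(r):
                    columns.append(r[k][0])
                    values.append(r[k][1])
                else:
                    columns.append(0)   # padding entry, never read
                    values.append(0j)
        slice_ptr.append(len(values))
        row_len.extend(len(r) for r in chunk)
    return {"values": values, "columns": columns, "row_len": row_len,
            "slice_ptr": slice_ptr, "slice_size": slice_size}


def spmv_sliced_ell(m, x):
    """Compute y = A @ x for a matrix stored by build_sliced_ell."""
    n = len(m["row_len"])
    s = m["slice_size"]
    y = [0j] * n
    for sl, base in enumerate(m["slice_ptr"][:-1]):
        first = sl * s
        rows_here = min(s, n - first)  # the last slice may be shorter
        for local in range(rows_here):
            i = first + local
            acc = 0j
            for k in range(m["row_len"][i]):  # row length skips padding
                idx = base + k * rows_here + local
                acc += m["values"][idx] * x[m["columns"][idx]]
            y[i] = acc
    return y


# Small complex-valued example: 4 x 4 matrix with 7 nonzeros.
rows = [
    [(0, 2 + 1j), (1, 1j)],
    [(1, 3 + 0j)],
    [(0, 1 + 0j), (2, 4 - 1j), (3, 1j)],
    [(3, 5 + 0j)],
]
x = [1 + 0j, 2 + 0j, 1j, -1 + 0j]

m = build_sliced_ell(rows, slice_size=2)
stored_sliced = len(m["values"])                    # per-slice padding
stored_plain = len(rows) * max(len(r) for r in rows)  # global padding
print(stored_sliced, stored_plain)
print(spmv_sliced_ell(m, x))
```

In this toy case the sliced layout stores fewer entries than a plain ELLPACK layout padded to the globally longest row, which illustrates why slicing reduces the memory footprint when row lengths vary.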
Pages: 49-63
Page count: 15
Related papers
50 in total
  • [31] An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data
    Liu, Weifeng
    Vinter, Brian
    2014 IEEE 28TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 2014,
  • [32] Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs
    Berger, Gonzalo
    Dufrechou, Ernesto
    Ezzatti, Pablo
    EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 246 - 256
  • [33] FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs
    Ortega, Gloria
    Vazquez, Francisco
    Garcia, Inmaculada
    Garzon, Ester M.
    COMPUTER JOURNAL, 2014, 57 (07): 968 - 979
  • [34] Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication
    Fujiki, Daichi
    Chatterjee, Niladrish
    Lee, Donghyuk
    O'Connor, Mike
    PROCEEDINGS OF SC19: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2019,
  • [35] SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures
    Giannoula, Christina
    Fernandez, Ivan
    Luna, Juan Gomez
    Koziris, Nectarios
    Goumas, Georgios
    Mutlu, Onur
    PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2022, 6 (01)
  • [36] Towards Efficient Algorithms for Compressed Sparse-Sparse Matrix Product
    Ezouaoui, Sana
    Hamdi-Larbi, Olfa
    Mahjoub, Zaher
    2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017, : 651 - 658
  • [37] Efficient ordering algorithms for sparse matrix/vector methods
    Gooi, HB
    Wang, YQ
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 1998, 20 (01) : 53 - 59
  • [38] Efficient ordering algorithms for sparse matrix/vector methods
    Gooi, H.B.
    Wang, Y.Q.
    International Journal of Electrical Power and Energy System, 1998, 20 (01): 53 - 59
  • [39] Hardware Support for Efficient Sparse Matrix Vector Multiplication
    Ku, Anderson Kuei-An
    Kuo, Jenny Yi-Chun
    Xue, Jingling
    EUC 2008: PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING, VOL 1, MAIN CONFERENCE, 2008, : 37 - 43
  • [40] A new approach for sparse matrix vector product on NVIDIA GPUs
    Vazquez, F.
    Fernandez, J. J.
    Garzon, E. M.
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (08): 815 - 826