A MEMORY EFFICIENT AND FAST SPARSE MATRIX VECTOR PRODUCT ON A GPU

Cited by: 56

Authors:

Dziekonski, A. [1]

Lamecki, A. [1]

Mrozowski, M. [1]

Affiliations:

[1] Gdansk Univ Technol GUT, Fac Elect Telecommun & Informat ETI, WiComm Ctr Excellence, PL-80233 Gdansk, Poland

Source:

PROGRESS IN ELECTROMAGNETICS RESEARCH-PIER | 2011 / Vol. 116

Keywords:

FINITE-ELEMENT-METHOD; FDTD METHOD; SCATTERING; ALGORITHM; UNITS;

DOI:

10.2528/PIER11031607

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper proposes a new sparse matrix storage format which allows an efficient implementation of a sparse matrix vector product on a Fermi Graphics Processing Unit (GPU). Unlike previous formats, it has both a low memory footprint and good throughput. The new format, which we call Sliced ELLR-T, has been designed specifically for accelerating the iterative solution of large, sparse, complex-valued systems of linear equations arising in computational electromagnetics. Numerical tests have shown that the performance of the new implementation reaches 69 GFLOPS in complex single precision arithmetic. Compared to an optimized implementation on a six-core Central Processing Unit (CPU) (Intel Xeon 5680), this performance implies a speedup by a factor of six. In terms of speed, the new format is as fast as the best format published so far, and at the same time it does not introduce redundant zero elements which have to be stored to ensure fast memory access. Compared to previously published solutions, significantly larger problems can be handled using low-cost commodity GPUs with a limited amount of on-board memory.

Citation

Pages: 49-63

Page count: 15

50 items in total

[31] An Efficient GPU General Sparse Matrix-Matrix Multiplication for Irregular Data
Liu, Weifeng
Vinter, Brian
2014 IEEE 28TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM, 2014
[32] Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs
Berger, Gonzalo
Dufrechou, Ernesto
Ezzatti, Pablo
EURO-PAR 2023: PARALLEL PROCESSING WORKSHOPS, PT I, EURO-PAR 2023, 2024, 14351 : 246 - 256
[33] FastSpMM: An Efficient Library for Sparse Matrix Matrix Product on GPUs
Ortega, Gloria
Vazquez, Francisco
Garcia, Inmaculada
Garzon, Ester M.
COMPUTER JOURNAL, 2014, 57 (07): 968 - 979
[34] Near-Memory Data Transformation for Efficient Sparse Matrix Multi-Vector Multiplication
Fujiki, Daichi
Chatterjee, Niladrish
Lee, Donghyuk
O'Connor, Mike
PROCEEDINGS OF SC19: THE INTERNATIONAL CONFERENCE FOR HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS, 2019
[35] SparseP: Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures
Giannoula, Christina
Fernandez, Ivan
Luna, Juan Gomez
Koziris, Nectarios
Goumas, Georgios
Mutlu, Onur
PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS, 2022, 6 (01)
[36] Towards Efficient Algorithms for Compressed Sparse-Sparse Matrix Product
Ezouaoui, Sana
Hamdi-Larbi, Olfa
Mahjoub, Zaher
2017 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS), 2017: 651 - 658
[37] Efficient ordering algorithms for sparse matrix/vector methods
Gooi, HB
Wang, YQ
INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 1998, 20 (01) : 53 - 59
[38] Efficient ordering algorithms for sparse matrix/vector methods
Gooi, H.B.
Wang, Y.Q.
International Journal of Electrical Power and Energy System, 1998, 20 (01): 53 - 59
[39] Hardware Support for Efficient Sparse Matrix Vector Multiplication
Ku, Anderson Kuei-An
Kuo, Jenny Yi-Chun
Xue, Jingling
EUC 2008: PROCEEDINGS OF THE 5TH INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING, VOL 1, MAIN CONFERENCE, 2008: 37 - 43
[40] A new approach for sparse matrix vector product on NVIDIA GPUs
Vazquez, F.
Fernandez, J. J.
Garzon, E. M.
CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2011, 23 (08): 815 - 826
