Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs

Cited by: 10

Authors:

Yoshizawa, Hiroki [1]

Takahashi, Daisuke [2]

Affiliations:

[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan

[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki 3058573, Japan

Source:

15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012) | 2012

Funding:

Japan Science and Technology Agency;

Keywords:

SpMV; CRS; CG; GPGPU; CUDA;

DOI:

10.1109/ICCSE.2012.28

Chinese Library Classification number:

TP301 [Theory, Methods];

Discipline classification code:

081202;

Abstract:

Performance of sparse matrix-vector multiplication (SpMV) on GPUs is highly dependent on the structure of the sparse matrix used in the computation, the computing environment, and the selection of certain parameters. In this paper, we show that the performance achieved using kernel SpMV on GPUs for the compressed row storage (CRS) format depends greatly on optimal selection of a parameter, and we propose an efficient algorithm for the automatic selection of the optimal parameter. Kernel SpMV for the CRS format using automatic parameter selection achieves up to approximately 26% improvement over NVIDIA's CUSPARSE library. The conjugate gradient method is the most popular iterative method for solving sparse systems of linear equations. Kernel SpMV makes up the bulk of the conjugate gradient method calculations. By optimizing SpMV using our approach, the conjugate gradient method performs up to approximately 10% better than CULA Sparse.
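For readers unfamiliar with the storage format named in the abstract, the following is a minimal sequential sketch of SpMV over a CRS matrix. It is not the authors' GPU kernel, and it does not model the tuned parameter, which the abstract does not identify. The function name `spmv_crs` and the array names `values`, `col_idx`, and `row_ptr` are illustrative assumptions.

```python
def spmv_crs(values, col_idx, row_ptr, x):
    """Compute y = A * x for a matrix A stored in CRS form.

    values  : nonzero entries of A, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row_ptr[i] .. row_ptr[i + 1] - 1 index the entries of row i
    (All names are hypothetical; this is a plain CPU reference loop.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Walk only the stored nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix:
#   [4 0 1]
#   [0 3 0]
#   [2 0 5]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmv_crs(values, col_idx, row_ptr, x)
print(y)
```

Each row's work is proportional to its number of nonzeros, which is one reason performance on parallel hardware depends on the matrix structure, as the abstract notes.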

Pages: 130-136

Page count: 7

50 entries in total

[31] An architecture-aware technique for optimizing sparse matrix-vector multiplication on GPUs
Maggioni, Marco
Berger-Wolf, Tanya
2013 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2013, 18 : 329 - 338
[32] Leveraging Memory Copy Overlap for Efficient Sparse Matrix-Vector Multiplication on GPUs
Zeng, Guangsen
Zou, Yi
ELECTRONICS, 2023, 12 (17)
[33] A Performance Modeling and Optimization Analysis Tool for Sparse Matrix-Vector Multiplication on GPUs
Guo, Ping
Wang, Liqiang
Chen, Po
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2014, 25 (05) : 1112 - 1123
[34] Sparse Matrix-Vector Multiplication on GPGPUs
Filippone, Salvatore
Cardellini, Valeria
Barbieri, Davide
Fanfarillo, Alessandro
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2017, 43 (04)
[35] Auto-tuning of Sparse Matrix-Vector Multiplication on Graphics Processors
Abu-Sufah, Walid
Karim, Asma Abdel
SUPERCOMPUTING (ISC 2013), 2013, 7905 : 151 - 164
[36] TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU
Gao, Jianhua
Ji, Weixing
Tan, Zhaonian
Wang, Yizhuo
Shi, Feng
IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3732 - 3745
[37] CUDA-enabled Sparse Matrix-Vector Multiplication on GPUs using atomic operations
Dang, Hoang-Vu
Schmidt, Bertil
PARALLEL COMPUTING, 2013, 39 (11) : 737 - 750
[38] An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs
Ashari, Arash
Sedaghati, Naser
Eisenlohr, John
Sadayappan, P.
PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON SUPERCOMPUTING (ICS'14), 2014 : 273 - 282
[39] Performance Analysis of Sparse Matrix-Vector Multiplication (SpMV) on Graphics Processing Units (GPUs)
AlAhmadi, Sarah
Mohammed, Thaha
Albeshri, Aiiad
Katib, Iyad
Mehmood, Rashid
ELECTRONICS, 2020, 9 (10) : 1 - 30
[40] Hierarchical Matrix Operations on GPUs: Matrix-Vector Multiplication and Compression
Boukaram, Wajih
Turkiyyah, George
Keyes, David
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2019, 45 (01)