Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs

Cited by: 10
Authors
Yoshizawa, Hiroki [1 ]
Takahashi, Daisuke [2 ]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki 3058573, Japan
Source
15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012) | 2012
Funding
Japan Science and Technology Agency (JST);
Keywords
SpMV; CRS; CG; GPGPU; CUDA;
DOI
10.1109/ICCSE.2012.28
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
The performance of sparse matrix-vector multiplication (SpMV) on GPUs depends heavily on the structure of the sparse matrix, the computing environment, and the selection of certain kernel parameters. In this paper, we show that the performance of the SpMV kernel for the compressed row storage (CRS) format on GPUs depends greatly on the selection of one such parameter, and we propose an efficient algorithm for selecting the optimal value automatically. With automatic parameter selection, the CRS-format SpMV kernel achieves up to approximately a 26% improvement over NVIDIA's CUSPARSE library. The conjugate gradient method is the most popular iterative method for solving sparse systems of linear equations, and the SpMV kernel accounts for the bulk of its computation. By optimizing SpMV with our approach, the conjugate gradient method performs up to approximately 10% better than CULA Sparse.
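For the CRS (also called CSR) format, the kind of kernel parameter at stake here is commonly the number of threads that cooperate on each matrix row. The sketch below assumes the widely used CSR-vector scheme with a compile-time THREADS_PER_ROW parameter; the kernel name, the parameter, and the warp-shuffle reduction are illustrative assumptions, not the authors' implementation. An auto-tuner in the spirit of the paper would benchmark several candidate values and keep the fastest.

// A minimal sketch (assumed CSR-vector scheme, not the authors' code) of a
// CRS-format SpMV kernel whose performance hinges on one tunable parameter:
// THREADS_PER_ROW, the number of threads cooperating on each matrix row.
#include <cstdio>
#include <cuda_runtime.h>

template <int THREADS_PER_ROW>   // tuning parameter: a power of two, 1..32
__global__ void spmv_crs_vector(int num_rows,
                                const int    *row_ptr,  // CRS row pointers, length num_rows + 1
                                const int    *col_idx,  // CRS column indices of nonzeros
                                const double *val,      // CRS nonzero values
                                const double *x,        // input vector
                                double       *y)        // output: y = A * x
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int row  = tid / THREADS_PER_ROW;  // each group of THREADS_PER_ROW threads owns one row
    int lane = tid % THREADS_PER_ROW;  // this thread's slot within its group

    double sum = 0.0;
    if (row < num_rows) {
        // Threads of a group stride jointly over the row's nonzeros.
        for (int j = row_ptr[row] + lane; j < row_ptr[row + 1]; j += THREADS_PER_ROW)
            sum += val[j] * x[col_idx[j]];
    }

    // Reduce partial sums within each group; every thread of the warp reaches
    // this point (no early return above), so the full mask is safe.
    for (int offset = THREADS_PER_ROW / 2; offset > 0; offset /= 2)
        sum += __shfl_down_sync(0xffffffffu, sum, offset, THREADS_PER_ROW);

    if (row < num_rows && lane == 0)
        y[row] = sum;
}

int main() {
    // Tiny CRS example: A = [[4,1,0],[1,3,0],[0,0,2]], x = [1,1,1].
    const int n = 3;
    int    h_row_ptr[] = {0, 2, 4, 5};
    int    h_col_idx[] = {0, 1, 0, 1, 2};
    double h_val[]     = {4.0, 1.0, 1.0, 3.0, 2.0};
    double h_x[]       = {1.0, 1.0, 1.0}, h_y[3];

    int *row_ptr, *col_idx; double *val, *x, *y;
    cudaMalloc(&row_ptr, sizeof(h_row_ptr));
    cudaMalloc(&col_idx, sizeof(h_col_idx));
    cudaMalloc(&val,     sizeof(h_val));
    cudaMalloc(&x,       sizeof(h_x));
    cudaMalloc(&y,       n * sizeof(double));
    cudaMemcpy(row_ptr, h_row_ptr, sizeof(h_row_ptr), cudaMemcpyHostToDevice);
    cudaMemcpy(col_idx, h_col_idx, sizeof(h_col_idx), cudaMemcpyHostToDevice);
    cudaMemcpy(val,     h_val,     sizeof(h_val),     cudaMemcpyHostToDevice);
    cudaMemcpy(x,       h_x,       sizeof(h_x),       cudaMemcpyHostToDevice);

    const int TPR = 2;   // one candidate value; a tuner would try 1, 2, 4, ..., 32
    int threads = 128;
    int blocks  = (n * TPR + threads - 1) / threads;
    spmv_crs_vector<TPR><<<blocks, threads>>>(n, row_ptr, col_idx, val, x, y);
    cudaMemcpy(h_y, y, sizeof(h_y), cudaMemcpyDeviceToHost);
    printf("y = %g %g %g\n", h_y[0], h_y[1], h_y[2]);  // expected: 5 4 2
    return 0;
}

Under this scheme, values of THREADS_PER_ROW close to the average number of nonzeros per row tend to win: wide groups leave threads idle on short rows, while narrow groups serialize long rows. The best value therefore varies from matrix to matrix, which is what makes the per-matrix automatic selection described in the abstract pay off.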
Pages: 130-136
Page count: 7