Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS format on GPUs

Cited by: 10
Authors
Yoshizawa, Hiroki [1 ]
Takahashi, Daisuke [2 ]
Affiliations
[1] Univ Tsukuba, Grad Sch Syst & Informat Engn, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan
[2] Univ Tsukuba, Fac Engn Informat & Syst, Tsukuba, Ibaraki 3058573, Japan
Source
15TH IEEE INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE AND ENGINEERING (CSE 2012) / 10TH IEEE/IFIP INTERNATIONAL CONFERENCE ON EMBEDDED AND UBIQUITOUS COMPUTING (EUC 2012) | 2012
Funding
Japan Science and Technology Agency (JST);
Keywords
SpMV; CRS; CG; GPGPU; CUDA;
DOI
10.1109/ICCSE.2012.28
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline code
081202;
Abstract
The performance of sparse matrix-vector multiplication (SpMV) on GPUs depends heavily on the structure of the sparse matrix, the computing environment, and the selection of certain kernel parameters. In this paper, we show that the performance of the SpMV kernel for the compressed row storage (CRS) format on GPUs depends greatly on the selection of one such parameter, and we propose an efficient algorithm for selecting the optimal value automatically. With automatic parameter selection, the CRS-format SpMV kernel achieves up to approximately a 26% improvement over NVIDIA's CUSPARSE library. The conjugate gradient method is the most popular iterative method for solving sparse systems of linear equations, and the SpMV kernel accounts for the bulk of its computation. By optimizing SpMV with our approach, the conjugate gradient method performs up to approximately 10% better than CULA Sparse.
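For the CRS (also called CSR) format, the kind of kernel parameter at stake here is commonly the number of threads that cooperate on each matrix row. The sketch below assumes the widely used CSR-vector scheme with a compile-time THREADS_PER_ROW parameter; the kernel name, the parameter, and the warp-shuffle reduction are illustrative assumptions, not the authors' implementation. An auto-tuner in the spirit of the paper would benchmark several candidate values and keep the fastest.

// A minimal sketch (assumed CSR-vector scheme, not the authors' code) of a
// CRS-format SpMV kernel whose performance hinges on one tunable parameter:
// THREADS_PER_ROW, the number of threads cooperating on each matrix row.
#include <cstdio>
#include <cuda_runtime.h>

template <int THREADS_PER_ROW>   // tuning parameter: a power of two, 1..32
__global__ void spmv_crs_vector(int num_rows,
                                const int    *row_ptr,  // CRS row pointers, length num_rows + 1
                                const int    *col_idx,  // CRS column indices of nonzeros
                                const double *val,      // CRS nonzero values
                                const double *x,        // input vector
                                double       *y)        // output: y = A * x
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int row  = tid / THREADS_PER_ROW;  // each group of THREADS_PER_ROW threads owns one row
    int lane = tid % THREADS_PER_ROW;  // this thread's slot within its group

    double sum = 0.0;
    if (row < num_rows) {
        // Threads of a group stride jointly over the row's nonzeros.
        for (int j = row_ptr[row] + lane; j < row_ptr[row + 1]; j += THREADS_PER_ROW)
            sum += val[j] * x[col_idx[j]];
    }

    // Reduce partial sums within each group; every thread of the warp reaches
    // this point (no early return above), so the full mask is safe.
    for (int offset = THREADS_PER_ROW / 2; offset > 0; offset /= 2)
        sum += __shfl_down_sync(0xffffffffu, sum, offset, THREADS_PER_ROW);

    if (row < num_rows && lane == 0)
        y[row] = sum;
}

int main() {
    // Tiny CRS example: A = [[4,1,0],[1,3,0],[0,0,2]], x = [1,1,1].
    const int n = 3;
    int    h_row_ptr[] = {0, 2, 4, 5};
    int    h_col_idx[] = {0, 1, 0, 1, 2};
    double h_val[]     = {4.0, 1.0, 1.0, 3.0, 2.0};
    double h_x[]       = {1.0, 1.0, 1.0}, h_y[3];

    int *row_ptr, *col_idx; double *val, *x, *y;
    cudaMalloc(&row_ptr, sizeof(h_row_ptr));
    cudaMalloc(&col_idx, sizeof(h_col_idx));
    cudaMalloc(&val,     sizeof(h_val));
    cudaMalloc(&x,       sizeof(h_x));
    cudaMalloc(&y,       n * sizeof(double));
    cudaMemcpy(row_ptr, h_row_ptr, sizeof(h_row_ptr), cudaMemcpyHostToDevice);
    cudaMemcpy(col_idx, h_col_idx, sizeof(h_col_idx), cudaMemcpyHostToDevice);
    cudaMemcpy(val,     h_val,     sizeof(h_val),     cudaMemcpyHostToDevice);
    cudaMemcpy(x,       h_x,       sizeof(h_x),       cudaMemcpyHostToDevice);

    const int TPR = 2;   // one candidate value; a tuner would try 1, 2, 4, ..., 32
    int threads = 128;
    int blocks  = (n * TPR + threads - 1) / threads;
    spmv_crs_vector<TPR><<<blocks, threads>>>(n, row_ptr, col_idx, val, x, y);
    cudaMemcpy(h_y, y, sizeof(h_y), cudaMemcpyDeviceToHost);
    printf("y = %g %g %g\n", h_y[0], h_y[1], h_y[2]);  // expected: 5 4 2
    return 0;
}

Under this scheme, values of THREADS_PER_ROW close to the average number of nonzeros per row tend to win: wide groups leave threads idle on short rows, while narrow groups serialize long rows. The best value therefore varies from matrix to matrix, which is what makes the per-matrix automatic selection described in the abstract pay off.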
Pages: 130-136
Page count: 7