Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU

被引:3
|
作者
Gao, Jiaquan [1 ]
Qi, Panpan [2 ]
He, Guixia [3 ]
机构
[1] Nanjing Normal Univ, Sch Comp Sci & Technol, Nanjing 210023, Jiangsu, Peoples R China
[2] Zhejiang Univ Technol, Coll Comp Sci & Technol, Hangzhou 310023, Zhejiang, Peoples R China
[3] Zhejiang Univ Technol, Zhijiang Coll, Hangzhou 310024, Zhejiang, Peoples R China
关键词
FORMAT; PERFORMANCE;
D O I
10.1155/2016/4596943
中图分类号
T [工业技术];
学科分类号
08 ;
摘要
Sparse matrix-vector multiplication (SpMV) is an important operation in computational science and needs be accelerated because it often represents the dominant cost in many widely used iterative methods and eigenvalue problems. We achieve this objective by proposing a novel SpMV algorithm based on the compressed sparse row (CSR) on the GPU. Our method dynamically assigns different numbers of rows to each thread block and executes different optimization implementations on the basis of the number of rows it involves for each block. The process of accesses to the CSR arrays is fully coalesced, and the GPU's DRAM bandwidth is efficiently utilized by loading data into the shared memory, which alleviates the bottleneck of many existing CSR-based algorithms (i.e., CSR-scalar and CSR-vector). Test results on C2050 and K20c GPUs show that our method outperforms a perfect-CSR algorithm that inspires our work, the vendor tuned CUSPARSE V6.5 and CUSP V0.5.1, and three popular algorithms clSpMV, CSR5, and CSR-Adaptive.
引用
收藏
页数:14
相关论文
共 50 条
  • [31] Optimization of Sparse Matrix-Vector Multiplication by Auto Selecting Storage Schemes on GPU
    Kubota, Yuji
    Takahashi, Daisuke
    COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2011, PT II, 2011, 6783 : 547 - 561
  • [32] Performance Evaluation of Sparse Matrix-Vector Multiplication Using GPU/MIC Cluster
    Maeda, Hiroshi
    Takahashi, Daisuke
    PROCEEDINGS OF 2015 THIRD INTERNATIONAL SYMPOSIUM ON COMPUTING AND NETWORKING (CANDAR), 2015, : 396 - 399
  • [33] TaiChi: A Hybrid Compression Format for Binary Sparse Matrix-Vector Multiplication on GPU
    Gao, Jianhua
    Ji, Weixing
    Tan, Zhaonian
    Wang, Yizhuo
    Shi, Feng
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, 33 (12) : 3732 - 3745
  • [34] Efficient Sparse Matrix-Vector Multiplication on GPUs using the CSR Format, Pinned Memory and Overlap Data Transfer
    Huillcen Baca, Herwin Alayn
    Palomino Valdivia, Flor de Luz
    PROCEEDINGS OF THE 2019 IEEE XXVI INTERNATIONAL CONFERENCE ON ELECTRONICS, ELECTRICAL ENGINEERING AND COMPUTING (INTERCON), 2019,
  • [35] Vector ISA extension for sparse matrix-vector multiplication
    Vassiliadis, S
    Cotofana, S
    Stathis, P
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 708 - 715
  • [36] SpDRAM: Efficient In-DRAM Acceleration of Sparse Matrix-Vector Multiplication
    Kang, Jieui
    Choi, Soeun
    Lee, Eunjin
    Sim, Jaehyeong
    IEEE ACCESS, 2024, 12 : 176009 - 176021
  • [37] An Efficient Sparse Matrix-Vector Multiplication on Distributed Memory Parallel Computers
    Shahnaz, Rukhsana
    Usman, Anila
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2007, 7 (01): : 77 - 82
  • [38] Understanding the performance of sparse matrix-vector multiplication
    Goumas, Georgios
    Kourtis, Kornilios
    Anastopoulos, Nikos
    Karakasis, Vasileios
    Koziris, Nectarios
    PROCEEDINGS OF THE 16TH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2008, : 283 - +
  • [39] Sparse matrix-vector multiplication design on FPGAs
    Sun, Junqing
    Peterson, Gregory
    Storaasli, Olaf
    FCCM 2007: 15TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS, 2007, : 349 - +
  • [40] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer
    DuBois, David
    DuBois, Andrew
    Connor, Carolyn
    Poole, Steve
    PROCEEDINGS OF THE SIXTEENTH IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, 2008, : 239 - +