Sparse Matrix-Vector Multiplication Optimizations based on Matrix Bandwidth Reduction using NVIDIA CUDA

被引:7
|
作者
Xu, Shiming [1 ]
Lin, Hai Xiang [1 ]
Xue, Wei [2 ]
机构
[1] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
关键词
SpMV; GP-GPU; NVIDIA CUDA; RCM;
D O I
10.1109/DCABES.2010.162
中图分类号
TP39 [计算机的应用];
学科分类号
081203 ; 0835 ;
摘要
In this paper we propose the optimization of sparse matrix-vector multiplication (SpMV) with CUDA based on matrix bandwidth/profile reduction techniques. Computational time required to access dense vector is decoupled from SpMV computation. By reducing the matrix profile, the time required to access dense vector is reduced by 17% (for SP) and 24% (for DP). Reduced matrix bandwidth enables column index information compression with shorter formats, resulting in a 17% (for SP) and 10% (for DP) execution time reduction for accessing matrix data under ELLPACK format. The overall speedup for SpMV is 16% and 12.6% for the whole matrix test suite. The optimization proposed in this paper can be combined with other SpMV optimizations such as register blocking.
引用
下载
收藏
页码:609 / 614
页数:6
相关论文
共 50 条
  • [21] Understanding the performance of sparse matrix-vector multiplication
    Goumas, Georgios
    Kourtis, Kornilios
    Anastopoulos, Nikos
    Karakasis, Vasileios
    Koziris, Nectarios
    PROCEEDINGS OF THE 16TH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2008, : 283 - +
  • [22] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer
    DuBois, David
    DuBois, Andrew
    Connor, Carolyn
    Poole, Steve
    PROCEEDINGS OF THE SIXTEENTH IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, 2008, : 239 - +
  • [23] Sparse matrix-vector multiplication design on FPGAs
    Sun, Junqing
    Peterson, Gregory
    Storaasli, Olaf
    FCCM 2007: 15TH ANNUAL IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, PROCEEDINGS, 2007, : 349 - +
  • [24] Vector ISA extension for sparse matrix-vector multiplication
    Vassiliadis, S
    Cotofana, S
    Stathis, P
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 708 - 715
  • [25] STRUCTURED SPARSE MATRIX-VECTOR MULTIPLICATION ON A MASPAR
    DEHN, T
    EIERMANN, M
    GIEBERMANN, K
    SPERLING, V
    ZEITSCHRIFT FUR ANGEWANDTE MATHEMATIK UND MECHANIK, 1994, 74 (06): : T534 - T538
  • [26] Adaptive sparse matrix representation for efficient matrix-vector multiplication
    Zardoshti, Pantea
    Khunjush, Farshad
    Sarbazi-Azad, Hamid
    JOURNAL OF SUPERCOMPUTING, 2016, 72 (09): : 3366 - 3386
  • [27] Performance Aspects of Sparse Matrix-Vector Multiplication
    Simecek, I.
    ACTA POLYTECHNICA, 2006, 46 (03) : 3 - 8
  • [28] Sparse matrix-vector multiplication -: Final solution?
    Simecek, Ivan
    Tvrdik, Pavel
    PARALLEL PROCESSING AND APPLIED MATHEMATICS, 2008, 4967 : 156 - 165
  • [29] On improving the performance of sparse matrix-vector multiplication
    White, JB
    Sadayappan, P
    FOURTH INTERNATIONAL CONFERENCE ON HIGH-PERFORMANCE COMPUTING, PROCEEDINGS, 1997, : 66 - 71
  • [30] LightSpMV: Faster CSR-based Sparse Matrix-Vector Multiplication on CUDA-enabled GPUs
    Liu, Yongchao
    Schmidt, Bertil
    PROCEEDINGS OF THE ASAP2015 2015 IEEE 26TH INTERNATIONAL CONFERENCE ON APPLICATION-SPECIFIC SYSTEMS, ARCHITECTURES AND PROCESSORS, 2015, : 82 - 89