Sparse Matrix-Vector Multiplication Optimizations based on Matrix Bandwidth Reduction using NVIDIA CUDA

Cited: 7
Authors
Xu, Shiming [1 ]
Lin, Hai Xiang [1 ]
Xue, Wei [2 ]
Affiliations
[1] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Keywords
SpMV; GP-GPU; NVIDIA CUDA; RCM;
DOI
10.1109/DCABES.2010.162
CLC Classification Number (Chinese Library Classification)
TP39 [Computer Applications];
Subject Classification Codes
081203; 0835
Abstract
In this paper we propose optimizations for sparse matrix-vector multiplication (SpMV) with CUDA based on matrix bandwidth/profile reduction techniques. The computational time required to access the dense vector is decoupled from the rest of the SpMV computation. By reducing the matrix profile, the dense-vector access time is reduced by 17% for single precision (SP) and 24% for double precision (DP). The reduced matrix bandwidth also enables the column index information to be compressed into shorter formats, yielding a 17% (SP) and 10% (DP) reduction in the execution time spent accessing matrix data under the ELLPACK format. The overall SpMV speedup across the whole matrix test suite is 16% and 12.6%. The optimizations proposed in this paper can be combined with other SpMV optimizations such as register blocking.
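To make the column-index compression idea concrete, the sketch below shows a minimal ELLPACK SpMV kernel in CUDA in which column indices are stored as 16-bit offsets relative to the row index. This is only valid when the matrix half-bandwidth fits in 16 bits after a reordering such as RCM; the kernel name, one-thread-per-row mapping, column-major layout, and zero-padding convention are illustrative assumptions, not the paper's actual implementation.

// Sketch: ELLPACK SpMV with 16-bit relative column offsets (one thread per row).
// Assumes the matrix half-bandwidth fits in a signed 16-bit offset after reordering.
__global__ void spmv_ell_short_idx(int num_rows, int max_nnz_per_row,
                                    const short *col_offset,  // column - row, column-major ELL layout
                                    const float *val,         // matrix values, column-major ELL layout
                                    const float *x,           // dense input vector
                                    float *y)                 // output vector
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= num_rows) return;

    float sum = 0.0f;
    for (int k = 0; k < max_nnz_per_row; ++k) {
        int idx = k * num_rows + row;               // column-major ELLPACK storage
        float a = val[idx];
        if (a != 0.0f) {                            // padded slots hold zero values
            int col = row + (int)col_offset[idx];   // recover absolute column index
            sum += a * x[col];
        }
    }
    y[row] = sum;
}

A launch such as spmv_ell_short_idx<<<(num_rows + 127) / 128, 128>>>(...) would cover all rows; halving the index width roughly halves the index traffic per nonzero, which is the effect the abstract attributes to bandwidth reduction.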
Pages: 609 - 614
Number of pages: 6
Related Papers
50 records in total
  • [41] Load-balancing in sparse matrix-vector multiplication
    Nastea, SG
    Frieder, O
    ElGhazawi, T
    EIGHTH IEEE SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 1996, : 218 - 225
  • [42] Optimization by Runtime Specialization for Sparse Matrix-Vector Multiplication
    Kamin, Sam
    Garzaran, Maria Jesus
    Aktemur, Baris
    Xu, Danqing
    Yilmaz, Buse
    Chen, Zhongbo
    ACM SIGPLAN NOTICES, 2015, 50 (03) : 93 - 102
  • [43] A New Method of Sparse Matrix-Vector Multiplication on GPU
    Huan, Gao
    Qian, Zhang
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 954 - 958
  • [44] A new approach for accelerating the sparse matrix-vector multiplication
    Tvrdik, Pavel
    Simecek, Ivan
    SYNASC 2006: EIGHTH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, PROCEEDINGS, 2007, : 156 - +
  • [45] Adaptive diagonal sparse matrix-vector multiplication on GPU
    Gao, Jiaquan
    Xia, Yifei
    Yin, Renjie
    He, Guixia
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2021, 157 : 287 - 302
  • [46] No Zero Padded Sparse Matrix-Vector Multiplication on FPGAs
    Huang, Jiasen
    Ren, Junyan
    Yin, Wenbo
    Wang, Lingli
    PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), 2014, : 290 - 291
  • [47] Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers
    Schuman, Catherine D.
    Kay, Bill
    Date, Prasanna
    Kannan, Ramakrishnan
    Sao, Piyush
    Potok, Thomas E.
    2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021, : 308 - 311
  • [48] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer with Application
    Dubois, David
    Dubois, Andrew
    Boorman, Thomas
    Connor, Carolyn
    Poole, Steve
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2010, 3 (01)
  • [49] Optimization techniques for sparse matrix-vector multiplication on GPUs
    Maggioni, Marco
    Berger-Wolf, Tanya
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2016, 93-94 : 66 - 86
  • [50] Processor-efficient sparse matrix-vector multiplication
    Heath, LS
    Ribbens, CJ
    Pemmaraju, SV
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2004, 48 (3-4) : 589 - 608