Adaptive diagonal sparse matrix-vector multiplication on GPU

Cited by: 4
Authors
Gao, Jiaquan [1 ]
Xia, Yifei [1 ]
Yin, Renjie [1 ]
He, Guixia [2 ]
Affiliations
[1] Nanjing Normal Univ, Sch Comp & Elect Informat, Jiangsu Key Lab NSLSCS, Nanjing 210023, Peoples R China
[2] Zhejiang Univ Technol, Zhijiang Coll, Hangzhou 310024, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Diagonal sparse matrices; Sparse matrix-vector multiplication; Sparse storage format; CUDA; GPU; OPTIMIZATION; FORMAT; SPMV;
DOI
10.1016/j.jpdc.2021.07.007
CLC Number
TP301 [Theory and Methods]
Discipline Code
081202
Abstract
For diagonal sparse matrices that have many long zero sections, scatter points, or diagonals far from the main diagonal, a great number of zeros need to be filled to maintain the diagonal structure when they are stored in DIA, which degrades the performance of existing DIA kernels because the padded zeros consume extra computation and memory resources. This motivates us to present an adaptive sparse matrix-vector multiplication (SpMV) for diagonal sparse matrices on the graphics processing unit (GPU), called DIA-Adaptive, to alleviate this drawback of DIA kernels. DIA-Adaptive has the following characteristics: (1) in addition to DIA, two new sparse storage formats, BRCSD-I and BRCSD-II (Diagonal Compressed Storage based on Row-Blocks), are proposed to adapt to various types of diagonal sparse matrices, and SpMV kernels corresponding to these storage formats are presented; (2) a search engine is designed to choose the most appropriate storage format among DIA, BRCSD-I, and BRCSD-II for any given diagonal sparse matrix; and (3) a code generator is presented to automatically generate SpMV kernels. Using DIA-Adaptive, the ideal storage format and kernel are chosen automatically for any given diagonal sparse matrix, and thus high performance is achieved. Experimental results show that the proposed DIA-Adaptive is effective, achieves high performance and good parallelism, and outperforms state-of-the-art SpMV algorithms for all test cases. (C) 2021 Elsevier Inc. All rights reserved.
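To make the padding overhead described above concrete, the following is a minimal sketch of a conventional DIA SpMV kernel in CUDA with one thread per row. It is not the paper's DIA-Adaptive, BRCSD-I, or BRCSD-II code, none of which is reproduced in this record; the names num_diags, dia_offsets, and dia_values, and the diagonal-major layout, are illustrative assumptions. Note how every stored diagonal entry is processed even when it is a padded zero, which is exactly the cost the adaptive formats aim to avoid.

// Minimal sketch of a baseline DIA SpMV kernel (y = A*x), one thread per row.
// Illustrative only: num_diags, dia_offsets, and dia_values (stored
// diagonal-major, num_diags x num_rows) are assumed names, not the paper's code.
__global__ void dia_spmv(int num_rows, int num_cols, int num_diags,
                         const int* __restrict__ dia_offsets,
                         const double* __restrict__ dia_values,
                         const double* __restrict__ x,
                         double* __restrict__ y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= num_rows) return;

    double sum = 0.0;
    for (int d = 0; d < num_diags; ++d) {
        int col = row + dia_offsets[d];  // column touched by diagonal d in this row
        if (col >= 0 && col < num_cols) {
            // Padded zeros stored in dia_values still cost a load and a
            // multiply-add here; this is the overhead DIA-Adaptive seeks to avoid.
            sum += dia_values[(long long)d * num_rows + row] * x[col];
        }
    }
    y[row] = sum;
}

A host-side launch such as dia_spmv<<<(num_rows + 255) / 256, 256>>>(...) with the grid sized to cover all rows would apply this kernel; the paper's search engine and code generator instead select and emit a kernel matched to the matrix's diagonal structure.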
Pages: 287-302
Number of pages: 16
Related Papers
50 records in total (10 shown below)
  • [1] GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication
    Tao, Yuan
    Deng, Yangdong
    Mu, Shuai
    Zhang, Zhenzhong
    Zhu, Mingfa
    Xiao, Limin
    Ruan, Li
[J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (14): 3771-3789
  • [2] Implementing Sparse Matrix-Vector Multiplication with QCSR on GPU
    Zhang, Jilin
    Liu, Enyi
    Wan, Jian
    Ren, Yongjian
    Yue, Miao
    Wang, Jue
[J]. APPLIED MATHEMATICS & INFORMATION SCIENCES, 2013, 7 (02): 473-482
  • [3] Energy Evaluation of Sparse Matrix-Vector Multiplication on GPU
    Benatia, Akrem
    Ji, Weixing
    Wang, Yizhuo
    Shi, Feng
[J]. 2016 SEVENTH INTERNATIONAL GREEN AND SUSTAINABLE COMPUTING CONFERENCE (IGSC), 2016
  • [4] A New Method of Sparse Matrix-Vector Multiplication on GPU
    Huan, Gao
    Qian, Zhang
[J]. PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012: 954-958
  • [5] Optimization of quasi-diagonal matrix-vector multiplication on GPU
    Yang, Wangdong
    Li, Kenli
    Liu, Yan
    Shi, Lin
    Wan, Lanjun
[J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2014, 28 (02): 183-195
  • [6] Automatically Tuning Sparse Matrix-Vector Multiplication for GPU Architectures
    Monakov, Alexander
    Lokhmotov, Anton
    Avetisyan, Arutyun
[J]. HIGH PERFORMANCE EMBEDDED ARCHITECTURES AND COMPILERS, PROCEEDINGS, 2010, 5952: 111+
  • [7] Adaptive sparse matrix representation for efficient matrix-vector multiplication
    Zardoshti, Pantea
    Khunjush, Farshad
    Sarbazi-Azad, Hamid
[J]. JOURNAL OF SUPERCOMPUTING, 2016, 72 (09): 3366-3386
  • [8] Reducing Vector I/O for Faster GPU Sparse Matrix-Vector Multiplication
    Nguyen Quang Anh Pham
    Fan, Rui
    Wen, Yonggang
[J]. 2015 IEEE 29TH INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2015: 1043-1052
  • [9] Shuffle Reduction Based Sparse Matrix-Vector Multiplication on Kepler GPU
    Yuan Tao
    Huang Zhi-Bin
[J]. INTERNATIONAL JOURNAL OF GRID AND DISTRIBUTED COMPUTING, 2016, 9 (10): 99-106
  • [10] A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
    Guo, Dahai
    Gropp, William
    Olson, Luke N.
[J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30 (01): 103-120