Sparse Matrix-Vector Multiplication Optimizations based on Matrix Bandwidth Reduction using NVIDIA CUDA

Cited: 7
Authors
Xu, Shiming [1 ]
Lin, Hai Xiang [1 ]
Xue, Wei [2 ]
Affiliations
[1] Delft Univ Technol, Delft Inst Appl Math, Delft, Netherlands
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Keywords
SpMV; GP-GPU; NVIDIA CUDA; RCM;
DOI
10.1109/DCABES.2010.162
CLC Classification Number (Chinese Library Classification)
TP39 [Computer Applications];
Subject Classification Codes
081203; 0835
Abstract
In this paper we propose optimizations for sparse matrix-vector multiplication (SpMV) with CUDA based on matrix bandwidth/profile reduction techniques. The computational time required to access the dense vector is decoupled from the rest of the SpMV computation. By reducing the matrix profile, the dense-vector access time is reduced by 17% for single precision (SP) and 24% for double precision (DP). The reduced matrix bandwidth also enables the column index information to be compressed into shorter formats, yielding a 17% (SP) and 10% (DP) reduction in the execution time spent accessing matrix data under the ELLPACK format. The overall SpMV speedup across the whole matrix test suite is 16% and 12.6%. The optimizations proposed in this paper can be combined with other SpMV optimizations such as register blocking.
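To make the column-index compression idea concrete, the sketch below shows a minimal ELLPACK SpMV kernel in CUDA in which column indices are stored as 16-bit offsets relative to the row index. This is only valid when the matrix half-bandwidth fits in 16 bits after a reordering such as RCM; the kernel name, one-thread-per-row mapping, column-major layout, and zero-padding convention are illustrative assumptions, not the paper's actual implementation.

// Sketch: ELLPACK SpMV with 16-bit relative column offsets (one thread per row).
// Assumes the matrix half-bandwidth fits in a signed 16-bit offset after reordering.
__global__ void spmv_ell_short_idx(int num_rows, int max_nnz_per_row,
                                    const short *col_offset,  // column - row, column-major ELL layout
                                    const float *val,         // matrix values, column-major ELL layout
                                    const float *x,           // dense input vector
                                    float *y)                 // output vector
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= num_rows) return;

    float sum = 0.0f;
    for (int k = 0; k < max_nnz_per_row; ++k) {
        int idx = k * num_rows + row;               // column-major ELLPACK storage
        float a = val[idx];
        if (a != 0.0f) {                            // padded slots hold zero values
            int col = row + (int)col_offset[idx];   // recover absolute column index
            sum += a * x[col];
        }
    }
    y[row] = sum;
}

A launch such as spmv_ell_short_idx<<<(num_rows + 127) / 128, 128>>>(...) would cover all rows; halving the index width roughly halves the index traffic per nonzero, which is the effect the abstract attributes to bandwidth reduction.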
Pages: 609 - 614
Number of pages: 6
Related Papers
50 records in total
  • [41] Load-balancing in sparse matrix-vector multiplication
    Nastea, SG
    Frieder, O
    ElGhazawi, T
    EIGHTH IEEE SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 1996, : 218 - 225
  • [42] Optimization by Runtime Specialization for Sparse Matrix-Vector Multiplication
    Kamin, Sam
    Garzaran, Maria Jesus
    Aktemur, Baris
    Xu, Danqing
    Yilmaz, Buse
    Chen, Zhongbo
    ACM SIGPLAN NOTICES, 2015, 50 (03) : 93 - 102
  • [43] A New Method of Sparse Matrix-Vector Multiplication on GPU
    Huan, Gao
    Qian, Zhang
    PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 954 - 958
  • [44] A new approach for accelerating the sparse matrix-vector multiplication
    Tvrdik, Pavel
    Simecek, Ivan
    SYNASC 2006: EIGHTH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING, PROCEEDINGS, 2007, : 156 - +
  • [45] Adaptive diagonal sparse matrix-vector multiplication on GPU
    Gao, Jiaquan
    Xia, Yifei
    Yin, Renjie
    He, Guixia
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2021, 157 : 287 - 302
  • [46] No Zero Padded Sparse Matrix-Vector Multiplication on FPGAs
    Huang, Jiasen
    Ren, Junyan
    Yin, Wenbo
    Wang, Lingli
    PROCEEDINGS OF THE 2014 INTERNATIONAL CONFERENCE ON FIELD-PROGRAMMABLE TECHNOLOGY (FPT), 2014, : 290 - 291
  • [47] Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers
    Schuman, Catherine D.
    Kay, Bill
    Date, Prasanna
    Kannan, Ramakrishnan
    Sao, Piyush
    Potok, Thomas E.
    2021 IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM WORKSHOPS (IPDPSW), 2021, : 308 - 311
  • [48] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer with Application
    Dubois, David
    Dubois, Andrew
    Boorman, Thomas
    Connor, Carolyn
    Poole, Steve
    ACM TRANSACTIONS ON RECONFIGURABLE TECHNOLOGY AND SYSTEMS, 2010, 3 (01)
  • [49] Optimization techniques for sparse matrix-vector multiplication on GPUs
    Maggioni, Marco
    Berger-Wolf, Tanya
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2016, 93-94 : 66 - 86
  • [50] Processor-efficient sparse matrix-vector multiplication
    Heath, LS
    Ribbens, CJ
    Pemmaraju, SV
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2004, 48 (3-4) : 589 - 608