Design patterns for sparse-matrix computations on hybrid CPU/GPU platforms

Cited by: 8

Authors:

Cardellini, Valeria [1]

Filippone, Salvatore [1]

Rouson, Damian W. I. [2]

Affiliations:

[1] Univ Roma Tor Vergata, Rome, Italy

[2] Stanford Univ, Stanford, CA 94305 USA

Source:

SCIENTIFIC PROGRAMMING | 2014 / Vol. 22 / Issue 01

Keywords:

Design patterns; sparse matrices; GPGPU computing; MODEL;

DOI:

10.3233/SPR-130363

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification codes:

081202 ; 0835 ;

Abstract:

We apply object-oriented software design patterns to develop code for scientific software involving sparse matrices. Design patterns arise when multiple independent developments produce similar designs which converge onto a generic solution. We demonstrate how to use design patterns to implement an interface for sparse matrix computations on NVIDIA GPUs starting from PSBLAS, an existing sparse matrix library, and from existing sets of GPU kernels for sparse matrices. We also compare the throughput of the PSBLAS sparse matrix-vector multiplication on two platforms exploiting the GPU with that obtained by a CPU-only PSBLAS implementation. Our experiments exhibit encouraging results regarding the comparison between CPU and GPU executions in double precision, obtaining a speedup of up to 35.35 on NVIDIA GTX 285 with respect to AMD Athlon 7750, and up to 10.15 on NVIDIA Tesla C2050 with respect to Intel Xeon X5650.


Pages: 19

50 items in total

[41] Design Patterns for High-Performance Matrix Computations
Son, Hoang M.
MODELING, SIMULATION AND OPTIMIZATION OF COMPLEX PROCESSES, 2008, : 509 - 519
[42] Dynamic Load Balancing for High-Performance Graph Processing on Hybrid CPU-GPU Platforms
Heldens, Stijn
Varbanescu, Ana Lucia
Iosup, Alexandru
PROCEEDINGS OF 2016 6TH WORKSHOP ON IRREGULAR APPLICATIONS: ARCHITECTURE AND ALGORITHMS (IA3), 2016, : 62 - 65
[43] An Autotuning Engine for the 3D Fast Wavelet Transform on Clusters with Hybrid CPU + GPU Platforms
Gregorio Bernabé
Javier Cuenca
Domingo Giménez
International Journal of Parallel Programming, 2015, 43 : 1160 - 1191
[44] Cache performance optimization of irregular sparse matrix multiplication on modern multi-core CPU and GPU
Liu Li
Yang Guangwen
High Technology Letters, 2013, 19 (04) : 339 - 345
[45] A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
Guo, Dahai
Gropp, William
Olson, Luke N.
INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30 (01) : 103 - 120
[46] A Hybrid B+-tree as Solution for In-Memory Indexing on CPU-GPU Heterogeneous Computing Platforms
Shahvarani, Amirhesam
Jacobsen, Hans-Arno
SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 1523 - 1538
[47] An Autotuning Engine for the 3D Fast Wavelet Transform on Clusters with Hybrid CPU + GPU Platforms
Bernabe, Gregorio
Cuenca, Javier
Gimenez, Domingo
INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2015, 43 (06) : 1160 - 1191
[48] Design of a simulation model for high performance LINPACK in hybrid CPU-GPU systems
Hu, Yichang
Lu, Lu
JOURNAL OF SUPERCOMPUTING, 2021, 77 (12) : 13739 - 13756
[49] Design of a simulation model for high performance LINPACK in hybrid CPU-GPU systems
Yichang Hu
Lu Lu
The Journal of Supercomputing, 2021, 77 : 13739 - 13756
[50] Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters
Agarwal, Tejaswi
Becchi, Michela
PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14), 2014, : 505 - 506
