Design patterns for sparse-matrix computations on hybrid CPU/GPU platforms

Cited by: 8
Authors
Cardellini, Valeria [1]
Filippone, Salvatore [1]
Rouson, Damian W. I. [2]
Affiliations
[1] Univ Roma Tor Vergata, Rome, Italy
[2] Stanford Univ, Stanford, CA 94305 USA
Keywords
Design patterns; sparse matrices; GPGPU computing; MODEL;
DOI
10.3233/SPR-130363
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification codes
081202; 0835;
Abstract
We apply object-oriented software design patterns to develop code for scientific software involving sparse matrices. Design patterns arise when multiple independent developments produce similar designs which converge onto a generic solution. We demonstrate how to use design patterns to implement an interface for sparse matrix computations on NVIDIA GPUs starting from PSBLAS, an existing sparse matrix library, and from existing sets of GPU kernels for sparse matrices. We also compare the throughput of the PSBLAS sparse matrix-vector multiplication on two platforms exploiting the GPU with that obtained by a CPU-only PSBLAS implementation. Our experiments exhibit encouraging results regarding the comparison between CPU and GPU executions in double precision, obtaining a speedup of up to 35.35 on NVIDIA GTX 285 with respect to AMD Athlon 7750, and up to 10.15 on NVIDIA Tesla C2050 with respect to Intel Xeon X5650.
Pages: 1 / 19
Page count: 19
Related papers
50 in total
  • [41] Design Patterns for High-Performance Matrix Computations
    Son, Hoang M.
    MODELING, SIMULATION AND OPTIMIZATION OF COMPLEX PROCESSES, 2008, : 509 - 519
  • [42] Dynamic Load Balancing for High-Performance Graph Processing on Hybrid CPU-GPU Platforms
    Heldens, Stijn
    Varbanescu, Ana Lucia
    Iosup, Alexandru
    PROCEEDINGS OF 2016 6TH WORKSHOP ON IRREGULAR APPLICATIONS: ARCHITECTURE AND ALGORITHMS (IA3), 2016, : 62 - 65
  • [43] An Autotuning Engine for the 3D Fast Wavelet Transform on Clusters with Hybrid CPU + GPU Platforms
    Gregorio Bernabé
    Javier Cuenca
    Domingo Giménez
    International Journal of Parallel Programming, 2015, 43 : 1160 - 1191
  • [44] Cache performance optimization of irregular sparse matrix multiplication on modern multi-core CPU and GPU
    刘力
    LiuLi
    Yang Guang wen
    High Technology Letters, 2013, 19 (04) : 339 - 345
  • [45] A hybrid format for better performance of sparse matrix-vector multiplication on a GPU
    Guo, Dahai
    Gropp, William
    Olson, Luke N.
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2016, 30 (01) : 103 - 120
  • [46] A Hybrid B+-tree as Solution for In-Memory Indexing on CPU-GPU Heterogeneous Computing Platforms
    Shahvarani, Amirhesam
    Jacobsen, Hans-Arno
    SIGMOD'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON MANAGEMENT OF DATA, 2016, : 1523 - 1538
  • [47] An Autotuning Engine for the 3D Fast Wavelet Transform on Clusters with Hybrid CPU + GPU Platforms
    Bernabe, Gregorio
    Cuenca, Javier
    Gimenez, Domingo
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2015, 43 (06) : 1160 - 1191
  • [48] Design of a simulation model for high performance LINPACK in hybrid CPU-GPU systems
    Hu, Yichang
    Lu, Lu
    JOURNAL OF SUPERCOMPUTING, 2021, 77 (12) : 13739 - 13756
  • [49] Design of a simulation model for high performance LINPACK in hybrid CPU-GPU systems
    Yichang Hu
    Lu Lu
    The Journal of Supercomputing, 2021, 77 : 13739 - 13756
  • [50] Design of a Hybrid MPI-CUDA Benchmark Suite for CPU-GPU Clusters
    Agarwal, Tejaswi
    Becchi, Michela
    PROCEEDINGS OF THE 23RD INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT'14), 2014, : 505 - 506