High-Performance Matrix-Vector Multiplication on the GPU

Author
Sorensen, Hans Henrik Brandenborg [1]
Affiliation
[1] Tech Univ Denmark, Informat & Math Modelling, Bldg 321, DK-2800 Lyngby, Denmark
Keywords
GPU; Matrix-Vector Multiplication; Dense linear algebra
DOI
Not available
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this paper, we develop a high-performance GPU kernel for one of the most popular dense linear algebra operations, matrix-vector multiplication. The target hardware is the most recent Nvidia Tesla 20-series (Fermi architecture), which is designed from the ground up for scientific computing. We show that achieving high performance for dense matrix-vector multiplication is essentially a matter of fully utilizing the fine-grained parallelism of the many-core GPU. We also show that auto-tuning can be successfully applied to the GPU kernel so that it performs well for all matrix shapes and sizes.
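The abstract does not describe the kernel itself, so the following is only an illustrative CPU-side sketch of the two ideas it names, not the author's implementation. It assumes a common decomposition in which several cooperating "threads" each accumulate a strided partial dot product for one row and then combine their results with a pairwise reduction, and it assumes a simple timing-based search as a stand-in for auto-tuning. The names `gemv_blocked`, `autotune`, and `threads_per_row` are hypothetical.

```python
import time


def gemv_blocked(A, x, threads_per_row=4):
    """Compute y = A x, emulating `threads_per_row` cooperating threads per row.

    Hypothetical sketch: on a GPU the partial sums would be computed in
    parallel; here they are computed one after another on the CPU.
    """
    n_cols = len(x)
    y = []
    for row in A:
        # Emulated thread t handles columns t, t + threads_per_row, ...
        partials = [
            sum(row[j] * x[j] for j in range(t, n_cols, threads_per_row))
            for t in range(threads_per_row)
        ]
        # Pairwise (tree) reduction of the partial sums into partials[0].
        stride = 1
        while stride < threads_per_row:
            for t in range(0, threads_per_row - stride, 2 * stride):
                partials[t] += partials[t + stride]
            stride *= 2
        y.append(partials[0])
    return y


def autotune(A, x, candidates=(1, 2, 4, 8)):
    """Return the candidate `threads_per_row` with the shortest measured run.

    Assumption: one timed run per candidate; a real tuner would repeat
    measurements and cover a range of matrix shapes.
    """
    best, best_time = None, float("inf")
    for c in candidates:
        start = time.perf_counter()
        gemv_blocked(A, x, c)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best, best_time = c, elapsed
    return best
```

Calling `autotune(A, x)` once per matrix shape and reusing the returned value in `gemv_blocked` mirrors, in miniature, how a tuned parameter could be chosen per shape; the timings here reflect the Python interpreter, not GPU behavior.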
Pages: 377-386
Number of pages: 10