A work-efficient parallel sparse matrix-sparse vector multiplication algorithm

Cited by: 33
Authors
Azad, Ariful [1 ]
Buluc, Aydin [1 ]
Affiliations
[1] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
DOI
10.1109/IPDPS.2017.76
Chinese Library Classification
TP3 [computing technology; computer technology];
Discipline Code
0812 ;
Abstract
We design and develop a work-efficient multithreaded algorithm for sparse matrix-sparse vector multiplication (SpMSpV) where the matrix, the input vector, and the output vector are all sparse. SpMSpV is an important primitive in the emerging GraphBLAS standard and is the workhorse of many graph algorithms including breadth-first search, bipartite graph matching, and maximal independent set. As thread counts increase, existing multithreaded SpMSpV algorithms can spend more time accessing the sparse matrix data structure than doing arithmetic. Our shared-memory parallel SpMSpV algorithm is work efficient in the sense that its total work is proportional to the number of arithmetic operations required. The key insight is to avoid having each thread individually scan the list of matrix columns. Our algorithm is simple to implement and operates on existing column-based sparse matrix formats. It performs well on diverse matrices and vectors with heterogeneous sparsity patterns. A high-performance implementation of the algorithm attains up to 15x speedup on a 24-core Intel Ivy Bridge processor and up to 49x speedup on a 64-core Intel KNL manycore processor. In contrast to implementations of existing algorithms, the performance of our algorithm is sustained on a variety of different input types, including matrices representing scale-free and high-diameter graphs.
Pages: 688 - 697
Page count: 10