A work-efficient parallel sparse matrix-sparse vector multiplication algorithm

Cited by: 33
Authors
Azad, Ariful [1]
Buluc, Aydin [1]
Affiliations
[1] Lawrence Berkeley Natl Lab, Computat Res Div, Berkeley, CA 94720 USA
Keywords
DOI
10.1109/IPDPS.2017.76
Chinese Library Classification (CLC)
TP3 [computing technology; computer technology];
Discipline classification code
0812;
Abstract
We design and develop a work-efficient multithreaded algorithm for sparse matrix-sparse vector multiplication (SpMSpV) where the matrix, the input vector, and the output vector are all sparse. SpMSpV is an important primitive in the emerging GraphBLAS standard and is the workhorse of many graph algorithms including breadth-first search, bipartite graph matching, and maximal independent set. As thread counts increase, existing multithreaded SpMSpV algorithms can spend more time accessing the sparse matrix data structure than doing arithmetic. Our shared-memory parallel SpMSpV algorithm is work efficient in the sense that its total work is proportional to the number of arithmetic operations required. The key insight is to avoid having each thread individually scan the list of matrix columns. Our algorithm is simple to implement and operates on existing column-based sparse matrix formats. It performs well on diverse matrices and vectors with heterogeneous sparsity patterns. A high-performance implementation of the algorithm attains up to 15x speedup on a 24-core Intel Ivy Bridge processor and up to 49x speedup on a 64-core Intel KNL manycore processor. In contrast to implementations of existing algorithms, the performance of our algorithm is sustained on a variety of different input types, including matrices representing scale-free and high-diameter graphs.
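For context, the sketch below illustrates the basic SpMSpV operation that the paper parallelizes: a minimal, sequential C++ version over a CSC (compressed sparse column) matrix, using a dense sparse-accumulator (SPA) to merge column contributions. It is only an illustration of why touching just the columns selected by the sparse input vector keeps the work proportional to the arithmetic performed; it is not the paper's multithreaded algorithm, and the names CSCMatrix, SparseVector, and spmspv are illustrative, not from the paper.

```cpp
// Minimal sequential SpMSpV sketch: y = A * x with A in CSC format and
// x, y stored as sparse (index, value) lists. Only the columns of A that
// correspond to nonzeros of x are visited, so the total work is
// proportional to the number of multiply-adds actually performed.
#include <cstddef>
#include <cstdio>
#include <vector>

struct CSCMatrix {                  // n-by-n matrix, compressed sparse column
    int n;
    std::vector<int> colptr;        // size n+1; column j spans [colptr[j], colptr[j+1])
    std::vector<int> rowidx;        // row index of each stored nonzero
    std::vector<double> val;        // value of each stored nonzero
};

struct SparseVector {               // parallel arrays of indices and values
    std::vector<int> idx;
    std::vector<double> val;
};

SparseVector spmspv(const CSCMatrix& A, const SparseVector& x) {
    std::vector<double> spa(A.n, 0.0);   // dense sparse-accumulator (SPA)
    std::vector<bool> touched(A.n, false);
    std::vector<int> nz_rows;            // rows that received a contribution

    for (std::size_t k = 0; k < x.idx.size(); ++k) {
        int j = x.idx[k];                // visit only columns selected by x
        double xj = x.val[k];
        for (int p = A.colptr[j]; p < A.colptr[j + 1]; ++p) {
            int i = A.rowidx[p];
            if (!touched[i]) { touched[i] = true; nz_rows.push_back(i); }
            spa[i] += A.val[p] * xj;     // one multiply-add per nonzero used
        }
    }

    SparseVector y;                      // compact touched entries into sparse output
    for (int i : nz_rows) { y.idx.push_back(i); y.val.push_back(spa[i]); }
    return y;
}

int main() {
    // A = [[1,0,2],[0,3,0],[4,0,5]] stored column by column.
    CSCMatrix A{3, {0, 2, 3, 5}, {0, 2, 1, 0, 2}, {1.0, 4.0, 3.0, 2.0, 5.0}};
    SparseVector x{{0, 2}, {1.0, 1.0}};  // x = (1, 0, 1)
    SparseVector y = spmspv(A, x);       // expected: y[0] = 3, y[2] = 9
    for (std::size_t k = 0; k < y.idx.size(); ++k)
        std::printf("y[%d] = %g\n", y.idx[k], y.val[k]);
    return 0;
}
```

Making this merge step scale across threads, without each thread re-scanning the matrix's column list, is the problem the paper addresses.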
Pages: 688-697
Page count: 10
Related papers
50 items in total
  • [41] TileSpGEMM: A Tiled Algorithm for Parallel Sparse General Matrix-Matrix Multiplication on GPUs
    Niu, Yuyao
    Lu, Zhengyang
    Ji, Haonan
    Song, Shuhui
    Jin, Zhou
    Liu, Weifeng
    PPOPP'22: PROCEEDINGS OF THE 27TH ACM SIGPLAN SYMPOSIUM ON PRINCIPLES AND PRACTICE OF PARALLEL PROGRAMMING, 2022, : 90 - 106
  • [42] An Efficient Sparse Matrix Multiplication for skewed matrix on GPU
    Shah, Monika
    Patel, Vibha
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 1301 - 1306
  • [43] The study of impact of matrix-processor mapping on the parallel sparse matrix-vector multiplication
    Simecek, I.
    Langr, D.
    Srnec, E.
    2013 15TH INTERNATIONAL SYMPOSIUM ON SYMBOLIC AND NUMERIC ALGORITHMS FOR SCIENTIFIC COMPUTING (SYNASC 2013), 2014, : 321 - 328
  • [44] Efficient sparse matrix multiple-vector multiplication using a bitmapped format
    Kannan, Ramaseshan
    2013 20TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING (HIPC), 2013, : 286 - 294
  • [45] Efficient CSR-Based Sparse Matrix-Vector Multiplication on GPU
    Gao, Jiaquan
    Qi, Panpan
    He, Guixia
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2016, 2016
  • [46] Efficient multithreaded untransposed, transposed or symmetric sparse matrix-vector multiplication with the Recursive Sparse Blocks format
    Martone, Michele
    PARALLEL COMPUTING, 2014, 40 (07) : 251 - 270
  • [47] Vector ISA extension for sparse matrix-vector multiplication
    Vassiliadis, S
    Cotofana, S
    Stathis, P
    EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 708 - 715
  • [48] An efficient parallel EM algorithm: A sparse matrix compaction technique
    Jeng, WM
    Huang, S
    INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOL VI, PROCEEDINGS, 1999, : 2954 - 2959
  • [49] FAST ALGORITHM FOR SPARSE-MATRIX MULTIPLICATION
    SCHOOR, A
    INFORMATION PROCESSING LETTERS, 1982, 15 (02) : 87 - 89
  • [50] An efficient sparse stiffness matrix vector multiplication using compressed sparse row storage format on AMD GPU
    Xing, Longyue
    Wang, Zhaoshun
    Ding, Zhezhao
    Chu, Genshen
    Dong, Lingyu
    Xiao, Nan
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2022, 34 (23):