Exploring the Design Space of Distributed Parallel Sparse Matrix-Multiple Vector Multiplication

Cited by: 0
Authors
Huang, Hua [1 ]
Chow, Edmond [1 ]
Affiliations
[1] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
Keywords
Sparse matrices; Partitioning algorithms; Vectors; Costs; Three-dimensional displays; Space exploration; Optimization; SpMM; SpMV; distributed-memory matrix multiplication; communication optimization; OPTIMIZATION; PERFORMANCE; FRAMEWORK;
DOI
10.1109/TPDS.2024.3452478
Chinese Library Classification (CLC)
TP301 [Theory, Methods];
Discipline Code
081202;
Abstract
We consider the distributed memory parallel multiplication of a sparse matrix by a dense matrix (SpMM). The dense matrix is often a collection of dense vectors. Standard implementations will multiply the sparse matrix by multiple dense vectors at the same time, to exploit the computational efficiencies therein. But such approaches generally utilize the same sparse matrix partitioning as if multiplying by a single vector. This article explores the design space of parallelizing SpMM and shows that a coarser grain partitioning of the matrix combined with a column-wise partitioning of the block of vectors can often require less communication volume and achieve higher SpMM performance. An algorithm is presented that chooses a process grid geometry for a given number of processes to optimize the performance of parallel SpMM. The algorithm can augment existing graph partitioners by utilizing the additional concurrency available when multiplying by multiple dense vectors to further reduce communication.
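To make the setting concrete, below is a minimal sketch (not the authors' algorithm) of the standard baseline the abstract contrasts against: a 1D row partitioning of the sparse matrix, applied to the full block of dense vectors, using mpi4py and SciPy. All names and sizes (A_local, X, m, k) are illustrative assumptions; the paper's contribution of partitioning the block of vectors column-wise over a chosen process grid is deliberately not shown here.

from mpi4py import MPI
import numpy as np
import scipy.sparse as sp

comm = MPI.COMM_WORLD
rank, p = comm.Get_rank(), comm.Get_size()

# Global problem sizes (assumed for the demo): m x m sparse matrix, k dense vectors.
m, k = 10_000, 16
rows = np.array_split(np.arange(m), p)[rank]   # this rank's contiguous row block

# Each rank owns one block of rows of the sparse matrix (random for illustration).
A_local = sp.random(len(rows), m, density=1e-4, format="csr", random_state=rank)

# Simplification: replicate the whole dense block X on every rank; a real
# distributed SpMM communicates only the rows of X that each rank needs,
# which is exactly the communication volume the paper seeks to reduce.
X = np.random.rand(m, k) if rank == 0 else None
X = comm.bcast(X, root=0)

Y_local = A_local @ X                          # local SpMM on the owned row block
Y_blocks = comm.gather(Y_local, root=0)        # collect the row blocks of the result
if rank == 0:
    Y = np.vstack(Y_blocks)
    print("Y shape:", Y.shape)

Run with, e.g., "mpiexec -n 4 python spmm_baseline.py" (script name assumed). Because every rank handles all k columns of X, this baseline uses the same matrix partitioning it would for a single vector, which is the design point the article moves away from.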
Pages: 1977-1988
Page count: 12