Partitioning Models for Scaling Parallel Sparse Matrix-Matrix Multiplication

Cited by: 18
Authors
Akbudak, Kadir [1]
Selvitopi, Oguz [1]
Aykanat, Cevdet [1]
Affiliation
[1] Bilkent Univ, Comp Engn Dept, TR-06800 Ankara, Turkey
Keywords
Sparse matrix-matrix multiplication; SpGEMM; hypergraph partitioning; graph partitioning; communication cost; bandwidth; latency
DOI
10.1145/3155292
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
We investigate outer-product-parallel, inner-product-parallel, and row-by-row-product-parallel formulations of sparse matrix-matrix multiplication (SpGEMM) on distributed memory architectures. For each of these three formulations, we propose a hypergraph model and a bipartite graph model for distributing SpGEMM computations based on one-dimensional (1D) partitioning of input matrices. We also propose a communication hypergraph model for each formulation for distributing communication operations. The computational graph and hypergraph models adopted in the first phase aim at minimizing the total message volume and balancing the computational loads of processors, whereas the communication hypergraph models adopted in the second phase aim at minimizing the total message count and balancing the message volume loads of processors. That is, the computational partitioning models reduce the bandwidth cost and the communication hypergraph models reduce the latency cost. Our extensive parallel experiments on up to 2048 processors for a wide range of realistic SpGEMM instances show that although the outer-product-parallel formulation scales better, the row-by-row-product-parallel formulation is more viable due to its significantly lower partitioning overhead and competitive scalability. For computational partitioning models, our experimental findings indicate that the proposed bipartite graph models are attractive alternatives to their hypergraph counterparts because of their lower partitioning overhead. Finally, we show that by reducing the latency cost in addition to the bandwidth cost through the communication hypergraph models, the parallel SpGEMM time can be further improved by up to 32%.
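The three formulations named in the abstract differ only in how the multiplications a(i,k) * b(k,j) are grouped into tasks. The following is a minimal sequential sketch of that grouping, not the authors' implementation: the dict-of-dicts matrix layout and all function names are assumptions made for illustration, and the partitioning and communication models themselves are not shown.

```python
from collections import defaultdict

def transpose(M):
    """Return the transpose of a dict-of-dicts sparse matrix M[i][j] = value."""
    T = defaultdict(dict)
    for i, row in M.items():
        for j, v in row.items():
            T[j][i] = v
    return dict(T)

def spgemm_row_by_row(A, B):
    """Row-by-row: task i computes C(i,:) = sum over k of a(i,k) * B(k,:)."""
    C = {}
    for i, a_row in A.items():
        acc = defaultdict(float)
        for k, a_ik in a_row.items():
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            C[i] = dict(acc)
    return C

def spgemm_outer(A, B):
    """Outer product: task k computes A(:,k) * B(k,:); partial results are summed."""
    C = defaultdict(lambda: defaultdict(float))
    for k, a_col in transpose(A).items():
        b_row = B.get(k, {})
        for i, a_ik in a_col.items():
            for j, b_kj in b_row.items():
                C[i][j] += a_ik * b_kj
    return {i: dict(row) for i, row in C.items()}

def spgemm_inner(A, B):
    """Inner product: task (i, j) computes c(i,j) = A(i,:) . B(:,j)."""
    B_cols = transpose(B)
    C = {}
    for i, a_row in A.items():
        for j, b_col in B_cols.items():
            common = a_row.keys() & b_col.keys()  # indices k where both are nonzero
            if common:
                C.setdefault(i, {})[j] = sum(a_row[k] * b_col[k] for k in common)
    return C

# Small illustrative instance (values chosen arbitrarily).
A = {0: {0: 1, 2: 2}, 1: {1: 3}}
B = {0: {1: 4}, 1: {0: 5}, 2: {1: 6, 2: 7}}

C_row = spgemm_row_by_row(A, B)
C_outer = spgemm_outer(A, B)
C_inner = spgemm_inner(A, B)
```

All three functions produce the same product. In a distributed setting the grouping matters because it determines which data a processor must receive: the outer-product form yields partial results for C that must be summed across tasks, while the row-by-row and inner-product forms need rows or columns of the input matrices.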
Pages: 34