Balancing Computation and Communication in Distributed Sparse Matrix-Vector Multiplication

被引:0
|
作者
Mi, Hongli [1 ]
Yu, Xiangrui [1 ]
Yu, Xiaosong [1 ]
Wu, Shuangyuan [1 ]
Liu, Weifeng [1 ]
机构
[1] China Univ Petr, Super Sci Software Lab, Beijing, Peoples R China
基金
中国国家自然科学基金;
关键词
Distributed memory system; sparse matrixvector multiplication; load balancing; PARTITIONING MODELS; ALGORITHM; FORMAT;
D O I
10.1109/CCGRID57682.2023.00056
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Sparse Matrix-Vector Multiplication (SpMV) is a fundamental operation in a number of scientific and engineering problems. When the sparse matrices processed are large enough, distributed memory systems should be used to accelerate SpMV. At present, the optimization techniques for distributed SpMV mainly focus on reordering through graph or hypergraph partitioning. However, although the reordering could reduce the amount of communications in general, there are still load balancing challenges in computations and communications on distributed platforms that are not well addressed. In this paper, we propose two strategies to optimize SpMV on distributed clusters: (1) resizing the number of row blocks on the nodes for balancing the amount of computations, and (2) adjusting the column number of the diagonal blocks for balancing tasks and reducing communications among compute nodes. The experimental results show that compared with the classic distributed SpMV implementation and its variant reordered with graph partitioning, our algorithm achieves on average 77.20x and 5.18x (up to 460.52x and 27.50x) speedups, respectively. Also, our method bring on average 19.56x (up to 48.49x) speedup over a recently proposed hybrid distributed SpMV algorithm. In addition, our algorithm achieves obviously better scalability over these existing distributed SpMV methods.
引用
收藏
页码:535 / 544
页数:10
相关论文
共 50 条
  • [1] Communication balancing in parallel sparse matrix-vector multiplication
    Bisseling, RH
    Meesen, W
    [J]. ELECTRONIC TRANSACTIONS ON NUMERICAL ANALYSIS, 2005, 21 : 47 - 65
  • [2] Load-balancing in sparse matrix-vector multiplication
    Nastea, SG
    Frieder, O
    ElGhazawi, T
    [J]. EIGHTH IEEE SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING, PROCEEDINGS, 1996, : 218 - 225
  • [3] Sparse Matrix-Vector Multiplication on GPGPUs
    Filippone, Salvatore
    Cardellini, Valeria
    Barbieri, Davide
    Fanfarillo, Alessandro
    [J]. ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2017, 43 (04):
  • [4] An Efficient Sparse Matrix-Vector Multiplication on Distributed Memory Parallel Computers
    Shahnaz, Rukhsana
    Usman, Anila
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2007, 7 (01): : 77 - 82
  • [5] GPU accelerated sparse matrix-vector multiplication and sparse matrix-transpose vector multiplication
    Tao, Yuan
    Deng, Yangdong
    Mu, Shuai
    Zhang, Zhenzhong
    Zhu, Mingfa
    Xiao, Limin
    Ruan, Li
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2015, 27 (14): : 3771 - 3789
  • [6] Evaluating the Performance Impact of Communication Imbalance in Sparse Matrix-Vector Multiplication
    Utrera, Gladys
    Gil, Marisa
    Martorell, Xavier
    [J]. 23RD EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED, AND NETWORK-BASED PROCESSING (PDP 2015), 2015, : 321 - 328
  • [7] Vector ISA extension for sparse matrix-vector multiplication
    Vassiliadis, S
    Cotofana, S
    Stathis, P
    [J]. EURO-PAR'99: PARALLEL PROCESSING, 1999, 1685 : 708 - 715
  • [8] Node aware sparse matrix-vector multiplication
    Bienz, Amanda
    Gropp, William D.
    Olson, Luke N.
    [J]. JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2019, 130 : 166 - 178
  • [9] Sparse Matrix-Vector Multiplication on a Reconfigurable Supercomputer
    DuBois, David
    DuBois, Andrew
    Connor, Carolyn
    Poole, Steve
    [J]. PROCEEDINGS OF THE SIXTEENTH IEEE SYMPOSIUM ON FIELD-PROGRAMMABLE CUSTOM COMPUTING MACHINES, 2008, : 239 - +
  • [10] Understanding the performance of sparse matrix-vector multiplication
    Goumas, Georgios
    Kourtis, Kornilios
    Anastopoulos, Nikos
    Karakasis, Vasileios
    Koziris, Nectarios
    [J]. PROCEEDINGS OF THE 16TH EUROMICRO CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING, 2008, : 283 - +