A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver

Cited by: 0
Authors
Ding, Nan [1]
Liu, Yang [2]
Williams, Samuel [1]
Li, Xiaoye S. [2]
Affiliations
[1] Lawrence Berkeley National Laboratory, Computational Research Division, Berkeley, CA 94720 USA
[2] Lawrence Berkeley National Laboratory, Scalable Solvers Group, Berkeley, CA 94720 USA
Keywords
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
Sparse triangular solve (SpTRSV) is used in conjunction with sparse LU factorization to solve sparse linear systems, either as a direct solver or as a preconditioner. As GPUs have become first-class compute citizens, designing an efficient and scalable SpTRSV on multi-GPU HPC systems is imperative. In this paper, we leverage the GPU-initiated data transfers of NVSHMEM to implement and evaluate a multi-GPU SpTRSV. We create a novel producer-consumer paradigm to manage the computation and communication in SpTRSV and implement it using two CUDA streams. Our multi-GPU SpTRSV implementation using CUDA streams achieves a 3.7x speedup when using twelve GPUs (two nodes) relative to our implementation on a single GPU, and up to 6.1x compared to cuSPARSE csrsv2() over the range of one to eighteen GPUs. To further explain the observed performance and to explore the key matrix features that determine the potential performance benefit of using multiple GPUs, we extend the critical path model of SpTRSV to GPUs. We demonstrate that our performance model can explain various aspects of performance and performance bottlenecks on multi-GPU systems and can motivate code optimizations.
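For readers unfamiliar with the operation named in the abstract, the following is a minimal, sequential Python sketch of a sparse lower-triangular solve on a CSR matrix, together with a per-row dependency depth of the kind that critical path reasoning about SpTRSV builds on. It is not the authors' multi-GPU implementation and does not use NVSHMEM or CUDA; the function names `sptrsv_lower_csr` and `row_levels` and the small example matrix are hypothetical and chosen only for illustration.

```python
def sptrsv_lower_csr(indptr, indices, data, b):
    """Solve L x = b by forward substitution.

    L is assumed to be lower triangular, stored in CSR form
    (indptr, indices, data), with a nonzero diagonal entry in every row.
    """
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        acc = float(b[i])
        diag = None
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j == i:
                diag = data[k]
            elif j < i:
                # Row i consumes the already computed x[j].
                acc -= data[k] * x[j]
        if diag is None or diag == 0:
            raise ZeroDivisionError(f"missing or zero diagonal in row {i}")
        x[i] = acc / diag
    return x


def row_levels(indptr, indices):
    """Return the dependency depth of each row.

    Row i depends on every row j < i with a nonzero L[i][j]. Rows at the
    same depth have no dependencies on each other, and the number of
    distinct depths is the length of the longest dependency chain.
    """
    n = len(indptr) - 1
    level = [0] * n
    for i in range(n):
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if j < i:
                level[i] = max(level[i], level[j] + 1)
    return level


# Hypothetical 3x3 example: L = [[2, 0, 0], [1, 1, 0], [0, 3, 4]]
indptr = [0, 1, 3, 5]
indices = [0, 0, 1, 1, 2]
data = [2.0, 1.0, 1.0, 3.0, 4.0]
b = [2.0, 3.0, 10.0]

x = sptrsv_lower_csr(indptr, indices, data, b)
levels = row_levels(indptr, indices)
```

In this example every row depends on the one before it, so the dependency chain is as long as the matrix and no two rows could be solved concurrently. Matrices with many rows per depth expose more parallelism.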
Pages: 147-159
Page count: 13
Related papers
50 in total
  • [21] A Multi-GPU Parallel Algorithm in Hypersonic Flow Computations
    Lai, Jianqi
    Li, Hua
    Tian, Zhengyu
    Zhang, Ye
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
  • [22] WarpDrive: Massively Parallel Hashing on Multi-GPU Nodes
    Juenger, Daniel
    Hundt, Christian
    Schmidt, Bertil
    2018 32ND IEEE INTERNATIONAL PARALLEL AND DISTRIBUTED PROCESSING SYMPOSIUM (IPDPS), 2018, : 441 - 450
  • [23] MULTI-GPU PARALLEL IMPLEMENTATION OF SPATIAL-SPECTRAL KERNEL SPARSE REPRESENTATION FOR HYPERSPECTRAL IMAGE CLASSIFICATION
    Deng, Weishi
    Wu, Zebin
    Ma, Haoyang
    Wang, Qicong
    Sua, Jin
    Xu, Yang
    Yang, Jiandong
    Wei, Zhihui
    Liu, Hongyi
    IGARSS 2020 - 2020 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2020, : 517 - 520
  • [24] Multi-GPU Parallelization of the NAS Multi-Zone Parallel Benchmarks
    Gonzalez, Marc
    Morancho, Enric
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2021, 32 (01) : 229 - 241
  • [25] Efficient Parallel Implementations of Sparse Triangular Solves for GPU Architectures
    Li, Ruipeng
    Zhang, Chaoyu
    PROCEEDINGS OF THE 2020 SIAM CONFERENCE ON PARALLEL PROCESSING FOR SCIENTIFIC COMPUTING, PP, 2020, : 106 - 117
  • [26] Multi-GPU parallel acceleration scheme for meshfree peridynamic simulations
    Wang, Xiaoming
    Li, Shirui
    Dong, Weijia
    An, Boyang
    Huang, Hong
    He, Qing
    Wang, Ping
    Lv, Guanren
    THEORETICAL AND APPLIED FRACTURE MECHANICS, 2024, 131
  • [27] Multi-GPU parallel computing and task scheduling under virtualization
    College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing
    210016, China
    Int. J. Hybrid Inf. Technol., 7 (253-266):
  • [28] An efficient parallel collaborative filtering algorithm on multi-GPU platform
    Wang, Zhongya
    Liu, Ying
    Chiu, Steve
    JOURNAL OF SUPERCOMPUTING, 2016, 72 (06) : 2080 - 2094
  • [29] New Generation of WIPL-D In-Core Multi-GPU Solver
    Mrdakovic, Branko Lj.
    Kostic, Milan M.
    Olcan, Dragan I.
    Kolundzija, Branko M.
    2018 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM ON ANTENNAS AND PROPAGATION & USNC/URSI NATIONAL RADIO SCIENCE MEETING, 2018, : 413 - 414
  • [30] Acoustic scattering solver based on single level FMM for multi-GPU systems
    Lopez-Portugues, Miguel
    Lopez-Fernandez, Jesus A.
    Menendez-Canal, Jonatan
    Rodriguez-Campa, Alberto
    Ranilla, Jose
    JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2012, 72 (09) : 1057 - 1064