Reducing Communication Overhead in Multi-GPU Hybrid Solver for 2D Laplace's Equation

Cited by: 1
Authors
Czapinski, Michal [1]
Thompson, Chris [1]
Barnes, Stuart [1]
Affiliations
[1] Cranfield Univ, Appl Math & Comp Grp, Cranfield MK43 0AL, Beds, England
Keywords
Hybrid parallelism; Multiple GPUs; Heterogeneous architectures; Non-blocking communication; Laplace solver; CUDA; Conjugate gradients; Graphics; Implementation; Computation; Overlap
DOI
10.1007/s10766-013-0293-2
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
The possibility of porting algorithms to graphics processing units (GPUs) has attracted significant interest among researchers. The natural next step is to employ multiple GPUs, but communication overhead may limit further performance improvement. In this paper, we investigate techniques for reducing overhead on hybrid CPU-GPU platforms, including careful data layout, appropriate use of GPU memory spaces, and non-blocking communication. In addition, we propose an accurate automatic load balancing technique for heterogeneous environments. We validate our approach on a hybrid Jacobi solver for the 2D Laplace's equation. Experiments carried out using various graphics hardware and types of connectivity have confirmed that the proposed data layout allows our fastest CUDA kernels to reach the analytical limit for memory bandwidth (up to 106 GB/s on NVIDIA GTX 480), and that non-blocking communication significantly reduces overhead, allowing for almost linear speed-up, even when communication is carried out over relatively slow networks.
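For readers unfamiliar with the numerical method named in the abstract, the following is a minimal serial sketch of a Jacobi iteration for the 2D Laplace's equation. It is not the authors' hybrid CUDA implementation and involves no GPU, communication, or load balancing; the function names `jacobi_laplace` and `make_grid`, the tolerance, and the boundary values are all assumptions chosen for illustration.

```python
def jacobi_laplace(grid, tol=1e-6, max_iters=10_000):
    """Jacobi iteration for the 2D Laplace equation on a rectangular grid.

    Boundary values (first/last row and column) are held fixed. Each interior
    point is replaced by the mean of its four neighbours, using only values
    from the previous sweep, until the largest update falls below `tol`.
    Returns the final grid and the number of sweeps performed.
    """
    rows, cols = len(grid), len(grid[0])
    current = [row[:] for row in grid]
    for iteration in range(1, max_iters + 1):
        nxt = [row[:] for row in current]  # separate output buffer (Jacobi, not Gauss-Seidel)
        max_delta = 0.0
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                nxt[i][j] = 0.25 * (current[i - 1][j] + current[i + 1][j]
                                    + current[i][j - 1] + current[i][j + 1])
                max_delta = max(max_delta, abs(nxt[i][j] - current[i][j]))
        current = nxt
        if max_delta < tol:
            return current, iteration
    return current, max_iters


def make_grid(n, top=1.0):
    """Hypothetical test problem: n x n grid, top edge held at `top`, rest 0."""
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [top] * n
    return grid


if __name__ == "__main__":
    solution, sweeps = jacobi_laplace(make_grid(5))
    print(f"converged after {sweeps} sweeps; centre value = {solution[2][2]:.4f}")
```

Because every interior update reads only the previous sweep, all points can be updated independently. In a multi-device decomposition of such a solver, only the rows adjacent to a partition boundary would need to be exchanged between sweeps, which is the kind of communication the abstract is concerned with.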
Pages: 1032-1047
Page count: 16
Source
International Journal of Parallel Programming, 2014, 42: 1032-1047