Hybrid MPI and CUDA paralleled finite volume unstructured CFD simulations on a multi-GPU system

Cited by: 10
Authors
Zhang, Xi [1]
Guo, Xiaohu [2]
Weng, Yue [1]
Zhang, Xianwei [1]
Lu, Yutong [1]
Zhao, Zhong [3]
Affiliations
[1] Sun Yat Sen Univ, Sch Comp Sci & Engn, 132 East Outer Ring Rd, Guangzhou 510006, Guangdong, Peoples R China
[2] STFC Daresbury Lab, Hartree Ctr, Keckwick Lane, Warrington WA4 4AD, England
[3] China Aerodynam Res & Dev Ctr, Computat Aerodynam Inst, 6 South Sect,Second Ring Rd, Mianyang 621000, Sichuan, Peoples R China
Funding
UK Engineering and Physical Sciences Research Council
Keywords
Computational fluid dynamics; Unstructured mesh; Compressible flow; Graphic processing units; Optimizations; Scalability; Solvers
DOI
10.1016/j.future.2022.09.005
Chinese Library Classification
TP301 [Theory and Methods]
Discipline classification code
081202
Abstract
Porting unstructured Computational Fluid Dynamics (CFD) analysis of compressible flow to Graphics Processing Units (GPUs) faces two difficulties. First, indirect data access induces non-coalesced access to the GPU's global memory, which causes performance loss. Second, data exchange among multiple GPUs is complex because it involves both communication between processes and transfer between host and device, which degrades scalability. To increase data locality in unstructured finite volume GPU simulations of compressible flow, we apply several optimizations: cell and face renumbering, data dependence resolution, nested-loop splitting, and loop mode adjustment. A hybrid MPI-CUDA parallel framework that packs and unpacks exchange data on the GPU is then established for multi-GPU computing. After these optimizations, the performance of the whole application on a GPU increases by around 50%. Simulations of ONERA M6 cases on a single GPU (Nvidia Tesla V100) achieve an average speedup of 13.4x compared to those on 28 CPU cores (Intel Xeon Gold 6132). With 2 GPUs as the baseline, strong scaling results show a parallel efficiency of 42% on 200 GPUs, while weak scaling tests give a parallel efficiency of 82.4% up to 200 GPUs. (c) 2022 Elsevier B.V. All rights reserved.
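The scaling figures quoted in the abstract can be read with the conventional definitions of parallel efficiency relative to a baseline run. The sketch below assumes those textbook definitions; the abstract does not state which formula the authors used, and the function names are hypothetical. It also shows the arithmetic that turns the reported 42% strong scaling efficiency on 200 GPUs (2-GPU baseline) into an implied speedup over the baseline run.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Efficiency for a fixed total problem size (assumed conventional definition).

    t_base: wall time on the n_base-GPU baseline run
    t_n:    wall time on the n-GPU run
    """
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Efficiency when the problem size per GPU is held constant (assumed definition)."""
    return t_base / t_n


def implied_speedup(efficiency, n_base, n):
    """Speedup over the baseline run implied by a strong scaling efficiency."""
    return efficiency * (n / n_base)


# Figures from the abstract: 42% efficiency on 200 GPUs, baseline of 2 GPUs.
speedup_200 = implied_speedup(0.42, n_base=2, n=200)
print(f"Implied speedup over the 2-GPU run: {speedup_200:.1f}x")
```

Under these assumed definitions, going from 2 to 200 GPUs is a 100-fold increase in devices, so an efficiency of 42% corresponds to roughly a 42-fold reduction in wall time relative to the 2-GPU run.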
Pages: 1-16
Page count: 16