Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Cited by: 0
|
Author(s)
XIONG QinGang1
2 Graduate University of Chinese Academy of Sciences
Institution(s)
Funding
National Natural Science Foundation of China;
Keywords
asynchronous execution; compute unified device architecture; graphic processing unit; lattice Boltzmann method; non-blocking message passing interface; OpenMP;
DOI
Not available
CLC number
TP391.41 [];
Subject classification number
080203 ;
Abstract
Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively and systematically. In this article, we carry out LBM simulations on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP, and non-blocking MPI communication are incorporated to improve efficiency. The algorithm is tested on two-dimensional Couette flow, and the results are in good agreement with the analytical solution. For both one- and two-dimensional decompositions of space, the algorithm performs well, as most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm for large-scale engineering applications. The algorithm can be directly extended to three-dimensional decompositions of space and to other modeling methods, including explicit grid-based methods.
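The Couette-flow validation mentioned in the abstract can be reproduced in miniature. Below is a minimal, self-contained NumPy sketch of a D2Q9 BGK lattice Boltzmann solver for 2-D Couette flow with halfway bounce-back walls. This is only an illustration of the validation case, not the authors' CUDA/MPI implementation; the grid size, relaxation time `tau`, and wall speed `U` are arbitrary choices made here for the sketch.

```python
import numpy as np

# D2Q9 lattice: discrete velocities, weights, and opposite directions
c = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
w = np.array([4/9] + [1/9]*4 + [1/36]*4)
opp = np.array([0, 3, 4, 1, 2, 7, 8, 5, 6])

def equilibrium(rho, ux, uy):
    """BGK equilibrium distribution for D2Q9."""
    cu = 3.0 * (c[:, 0, None, None]*ux + c[:, 1, None, None]*uy)
    usq = 1.5 * (ux**2 + uy**2)
    return w[:, None, None] * rho * (1.0 + cu + 0.5*cu**2 - usq)

def couette(nx=8, ny=16, U=0.05, tau=0.8, steps=8000):
    """2-D Couette flow: bottom wall at rest, top wall sliding at speed U.

    Periodic in x; halfway bounce-back walls in y (walls sit half a
    cell beyond the first and last node rows, so the channel height
    is exactly ny lattice units).
    """
    rho = np.ones((ny, nx))
    ux = np.zeros((ny, nx))
    uy = np.zeros((ny, nx))
    f = equilibrium(rho, ux, uy)
    for _ in range(steps):
        # macroscopic moments
        rho = f.sum(axis=0)
        ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
        uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
        # BGK collision
        f += (equilibrium(rho, ux, uy) - f) / tau
        fpost = f.copy()                     # post-collision populations
        # streaming: periodic roll; wall rows are corrected below
        for i in range(9):
            f[i] = np.roll(np.roll(f[i], c[i, 0], axis=1), c[i, 1], axis=0)
        # halfway bounce-back at the resting bottom wall (row 0)
        for i in (2, 5, 6):                  # directions entering from below
            f[i][0, :] = fpost[opp[i]][0, :]
        # moving-wall bounce-back at the top wall (row ny-1),
        # with the momentum correction -6*w_i*(c_i . u_wall), rho ~ 1 assumed
        for i in (2, 5, 6):                  # directions leaving through the top
            f[opp[i]][-1, :] = fpost[i][-1, :] - 6.0 * w[i] * c[i, 0] * U
    return ux[:, 0]                          # velocity profile along y
```

At steady state the computed profile matches the analytical linear solution `u(y_j) = U*(j + 0.5)/ny`, with the half-cell offset coming from the halfway bounce-back placement of the walls.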
Pages: 707-715
Number of pages: 9
Related papers
50 records
  • [21] Implementation of a Lattice Boltzmann Method for Large Eddy Simulation on Multiple GPUs
    Li, Qinjian
    Zhong, Chengwen
    Li, Kai
    Zhang, Guangyong
    Lu, Xiaowei
    Zhang, Qing
    Zhao, Kaiyong
    Chu, Xiaowen
    2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 818 - 823
  • [22] GPU parallel implementation of a finite volume lattice Boltzmann method for incompressible flows
    Wen, Mengke
    Shen, Siyuan
    Li, Weidong
    Computers and Fluids, 2024, 285
  • [23] Efficient Implementation of Total FETI Solver for Graphic Processing Units Using Schur Complement
    Riha, Lubomir
    Brzobohaty, Tomas
    Markopoulos, Alexandros
    Kozubek, Tomas
    Meca, Ondrej
    Schenk, Olaf
    Vanroose, Wim
    HIGH PERFORMANCE COMPUTING IN SCIENCE AND ENGINEERING, HPCSE 2015, 2016, 9611 : 85 - 100
  • [24] Parallel data cube computation on graphic processing units
    Zhou G.-L.
    Chen H.
    Li C.-P.
    Wang S.
    Zheng T.
    Jisuanji Xuebao/Chinese Journal of Computers, 2010, 33 (10): : 1788 - 1798
  • [25] PARALLEL EFFICIENT METHOD OF MOMENTS EXPLOITING GRAPHICS PROCESSING UNITS
    De Donno, D.
    Esposito, A.
    Monti, G.
    Tarricone, L.
    MICROWAVE AND OPTICAL TECHNOLOGY LETTERS, 2010, 52 (11) : 2568 - 2572
  • [26] Implementation of Iron Loss Model on Graphic Processing Units
    Hussain, Sajid
    Silva, Rodrigo C. P.
    Lowther, David A.
    IEEE TRANSACTIONS ON MAGNETICS, 2016, 52 (03)
  • [27] A parallel lattice-Boltzmann method for large scale simulations of complex fluids
    Nekovee, M
    Chin, J
    González-Segredo, N
    Coveney, PV
    COMPUTATIONAL FLUID DYNAMICS, 2001, : 204 - 212
  • [28] Parallel Lattice Boltzmann Method with Blocked Partitioning
    Schepke, Claudio
    Maillard, Nicolas
    Navaux, Philippe O. A.
    INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2009, 37 (06) : 593 - 611
  • [30] Efficient computation of the geopotential gradient in graphic processing units
    Rubio, Carlos
    Gonzalo, Jesus
    Siminski, Jan
    Escapa, Alberto
    ADVANCES IN SPACE RESEARCH, 2024, 74 (01) : 332 - 347