Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units

Authors
XIONG QinGang (1)
Affiliations
(2) Graduate University of Chinese Academy of Sciences
Funding
National Natural Science Foundation of China;
Keywords
asynchronous execution; compute unified device architecture; graphic processing unit; lattice Boltzmann method; non-blocking message passing interface; OpenMP;
DOI
Not available
Chinese Library Classification number
TP391.41;
Discipline classification code
080203;
Abstract
Many-core processors, such as graphic processing units (GPUs), are promising platforms for intrinsically parallel algorithms such as the lattice Boltzmann method (LBM). Although tremendous speedup has been obtained on a single GPU compared with mainstream CPUs, the performance of the LBM on multiple GPUs has not been studied extensively or systematically. In this article, we carry out LBM simulations on a GPU cluster with many nodes, each having multiple Fermi GPUs. Asynchronous execution with CUDA stream functions, OpenMP, and non-blocking MPI communication are combined to improve efficiency. The algorithm is tested on two-dimensional Couette flow, and the results are in good agreement with the analytical solution. For both one- and two-dimensional decomposition of space, the algorithm performs well because most of the communication time is hidden. Direct numerical simulation of a two-dimensional gas-solid suspension containing more than one million solid particles and one billion gas lattice cells demonstrates the potential of this algorithm in large-scale engineering applications. The algorithm can be directly extended to three-dimensional decomposition of space and to other modeling methods, including explicit grid-based methods.
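The one-dimensional decomposition and communication hiding mentioned in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it uses plain Python lists instead of GPUs, MPI, or OpenMP, a toy 1D lattice with only a right-moving and a left-moving population, and periodic boundaries. All function names (`stream_global`, `split`, `stream_decomposed`) are hypothetical. The point is only the ordering: halo values are requested first, interior cells are updated without neighbour data, and boundary cells are completed once the halo values are available.

```python
def stream_global(f_right, f_left):
    """Reference streaming step on the undivided periodic 1D lattice."""
    n = len(f_right)
    new_right = [f_right[(i - 1) % n] for i in range(n)]  # shift one cell right
    new_left = [f_left[(i + 1) % n] for i in range(n)]    # shift one cell left
    return new_right, new_left


def split(values, n_parts):
    """Cut a list into n_parts contiguous, nearly equal chunks."""
    size, extra = divmod(len(values), n_parts)
    chunks, start = [], 0
    for p in range(n_parts):
        stop = start + size + (1 if p < extra else 0)
        chunks.append(values[start:stop])
        start = stop
    return chunks


def stream_decomposed(f_right, f_left, n_parts):
    """Same streaming step, performed subdomain by subdomain.

    Assumes every subdomain holds at least one cell.
    """
    rights = split(f_right, n_parts)
    lefts = split(f_left, n_parts)
    new_right, new_left = [], []
    for p in range(n_parts):
        # Step 1: request halo data from both neighbours. In a real code this
        # is where non-blocking sends and receives would be posted.
        halo_from_left = rights[(p - 1) % n_parts][-1]
        halo_from_right = lefts[(p + 1) % n_parts][0]

        # Step 2: update interior cells, which need no neighbour data. In a
        # real code this work overlaps with the messages still in flight.
        r = [None] + rights[p][:-1]
        l = lefts[p][1:] + [None]

        # Step 3: after the (here trivial) wait, finish the boundary cells.
        r[0] = halo_from_left
        l[-1] = halo_from_right

        new_right.extend(r)
        new_left.extend(l)
    return new_right, new_left
```

Running both functions on the same input should give identical lattices for any number of subdomains, which is a quick way to check that a halo exchange has been wired correctly before moving to a distributed setting.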
Pages: 707-715
Page count: 9
Related papers
50 in total
  • [1] Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units
    Xiong QinGang
    Li Bo
    Xu Ji
    Fang XiaoJian
    Wang XiaoWei
    Wang LiMin
    He XianFeng
    Ge Wei
    CHINESE SCIENCE BULLETIN, 2012, 57 (07): 707 - 715
  • [2] Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units
    XIONG QinGang, LI Bo, XU Ji, FANG XiaoJian, WANG XiaoWei, WANG LiMin, HE XianFeng, GE Wei (State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; Graduate University of Chinese Academy of Sciences, Beijing, China)
    Chinese Science Bulletin, 2012, 57 (07) : 707 - 715
  • [3] Efficient graphic processing unit implementation of the chemical-potential multiphase lattice Boltzmann method
    Ye, Yutong
    Zhu, Hongyin
    Zhang, Chaoying
    Wen, Binghai
    INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2021, 35 (01): 78 - 96
  • [4] A graphic processing unit implementation for the moment representation of the lattice Boltzmann method
    Ferrari, Marco A. A.
    de Oliveira Jr, Waine B. B.
    Lugarini, Alan
    Franco, Admilson T. T.
    Hegele Jr, Luiz A. A.
    INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, 2023, 95 (07) : 1076 - 1089
  • [5] Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units
    Obrecht, Christian
    Kuznik, Frederic
    Tourancheau, Bernard
    Roux, Jean-Jacques
    HIGH PERFORMANCE COMPUTING FOR COMPUTATIONAL SCIENCE - VECPAR 2010, 2011, 6449 : 151 - +
  • [6] Efficient implementation of Jacobi iterative method for large sparse linear systems on graphic processing units
    Abal-Kassim Cheik Ahamed
    Frédéric Magoulès
    The Journal of Supercomputing, 2017, 73 : 3411 - 3432
  • [7] Efficient implementation of Jacobi iterative method for large sparse linear systems on graphic processing units
    Ahamed, Abal-Kassim Cheik
    Magoules, Frederic
    JOURNAL OF SUPERCOMPUTING, 2017, 73 (08): 3411 - 3432
  • [8] Highly Efficient Implementation of Block Ciphers on Graphic Processing Units for Massively Large Data
    An, SangWoo
    Seo, Seog Chung
    APPLIED SCIENCES-BASEL, 2020, 10 (11):
  • [9] Sparse Geometries Handling in Lattice Boltzmann Method Implementation for Graphic Processors
    Tomczak, Tadeusz
    Szafran, Roman G.
    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2018, 29 (08) : 1865 - 1878
  • [10] Graphic processing unit computing of lattice Boltzmann method on a desktop computer
    Liu, Qiang
    Xie, Wei
    Qiu, Liao-Yuan
    Xie, Xue-Shen
    Shanghai Jiaotong Daxue Xuebao/Journal of Shanghai Jiaotong University, 2014, 48 (09): 1329 - 1333