Sailfish: A flexible multi-GPU implementation of the lattice Boltzmann method

Cited by: 82
Authors
Januszewski, M. [1, 2]
Kostur, M. [1]
Affiliations
[1] Univ Silesia, Inst Phys, PL-40007 Katowice, Poland
[2] Google Switzerland GmbH, CH-8002 Zurich, Switzerland
Keywords
Lattice Boltzmann; LBM; Computational fluid dynamics; Graphics processing unit; GPU; CUDA; BOUNDARY-CONDITIONS; BINARY-FLUID; SIMULATION; VISCOSITIES; FLOWS;
DOI
10.1016/j.cpc.2014.04.018
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
We present Sailfish, an open source fluid simulation package implementing the lattice Boltzmann method (LBM) on modern Graphics Processing Units (GPUs) using CUDA/OpenCL. We take a novel approach to GPU code implementation and use run-time code generation techniques and a high level programming language (Python) to achieve state of the art performance, while allowing easy experimentation with different LBM models and tuning for various types of hardware. We discuss the general design principles of the code, scaling to multiple GPUs in a distributed environment, as well as the GPU implementation and optimization of many different LBM models, both single component (BGK, MRT, ELBM) and multicomponent (Shan-Chen, free energy). The paper also presents results of performance benchmarks spanning the last three NVIDIA GPU generations (Tesla, Fermi, Kepler), which we hope will be useful for researchers working with this type of hardware and similar codes.

Program Summary
Program title: Sailfish
Catalogue identifier: AETA_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AETA_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU Lesser General Public License, version 3
No. of lines in distributed program, including test data, etc.: 225864
No. of bytes in distributed program, including test data, etc.: 46861049
Distribution format: tar.gz
Programming language: Python, CUDA C, OpenCL.
Computer: Any with an OpenCL or CUDA-compliant GPU.
Operating system: No limits (tested on Linux and Mac OS X).
RAM: Hundreds of megabytes to tens of gigabytes for typical cases.
Classification: 12, 6.5.
External routines: PyCUDA/PyOpenCL, Numpy, Mako, ZeroMQ (for multi-GPU simulations), scipy, sympy
Nature of problem: GPU-accelerated simulation of single- and multi-component fluid flows.
Solution method: A wide range of relaxation models (LBGK, MRT, regularized LB, ELBM, Shan-Chen, free energy, free surface) and boundary conditions within the lattice Boltzmann method framework. Simulations can be run in single or double precision using one or more GPUs.
Restrictions: The lattice Boltzmann method works for low Mach number flows only.
Unusual features: The actual numerical calculations run exclusively on GPUs. The numerical code is built dynamically at run-time in CUDA C or OpenCL, using templates and symbolic formulas. The high-level control of the simulation is maintained by a Python process.
Additional comments: The distribution file for this program is over 45 Mbytes and therefore is not delivered directly when Download or Email is requested. Instead an HTML file giving details of how the program can be obtained is sent.
Running time: Problem-dependent, typically minutes (for small cases or short simulations) to hours (large cases or long simulations).

(C) 2014 Elsevier B.V. All rights reserved.
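The run-time code generation mentioned under "Unusual features" can be illustrated with a minimal sketch. It is not Sailfish's actual generator: the program summary lists Mako and sympy, whereas this example uses only Python's standard `string.Template`, and the kernel name `collide_bgk`, its arguments, and the memory layout are assumptions made for illustration. It emits CUDA C source for a D2Q9 BGK collision step, with the floating-point precision chosen when the source is generated.

```python
from string import Template

# Standard D2Q9 lattice: discrete velocities (cx, cy) and their weights.
VELOCITIES = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
              (1, 1), (-1, 1), (-1, -1), (1, -1)]
WEIGHTS = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

# Hypothetical kernel skeleton; the name, arguments and memory layout
# are assumptions made for this sketch, not taken from Sailfish.
KERNEL = Template("""\
__global__ void collide_bgk($real *f, int n_nodes, $real tau)
{
    int gi = blockIdx.x * blockDim.x + threadIdx.x;
    if (gi >= n_nodes) return;
    $real rho = 0, ux = 0, uy = 0, fi, cu;
$moments
    ux /= rho;
    uy /= rho;
    $real usq = ux * ux + uy * uy;
$relax
}
""")


def generate_bgk_kernel(real="float"):
    """Return CUDA C source for a D2Q9 BGK collision step as a string."""
    moments, relax = [], []
    for i, ((cx, cy), w) in enumerate(zip(VELOCITIES, WEIGHTS)):
        idx = f"f[{i} * n_nodes + gi]"
        # Accumulate density and momentum from each distribution.
        moments.append(f"    fi = {idx}; rho += fi; "
                       f"ux += ({cx}) * fi; uy += ({cy}) * fi;")
        # Relax towards the second-order equilibrium distribution.
        relax.append(
            f"    cu = ({cx}) * ux + ({cy}) * uy;\n"
            f"    {idx} += (({real}){w!r} * rho * "
            f"(1 + 3 * cu + 4.5 * cu * cu - 1.5 * usq) - {idx}) / tau;")
    return KERNEL.substitute(real=real, moments="\n".join(moments),
                             relax="\n".join(relax))


source = generate_bgk_kernel("double")
```

The resulting string would then be handed to a GPU compiler wrapper such as PyCUDA or PyOpenCL, which the summary names as external routines. Because the lattice tables live in Python, swapping in another lattice or precision changes the generated source without any edits to hand-written GPU code.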
Pages: 2350 - 2368
Number of pages: 19
Related papers
  • [1] Multi-GPU implementation of the lattice Boltzmann method
    Obrecht, Christian
    Kuznik, Frederic
    Tourancheau, Bernard
    Roux, Jean-Jacques
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2013, 65 (02) : 252 - 261
  • [2] The TheLMA project: Multi-GPU implementation of the lattice Boltzmann method
    Obrecht, Christian
    Kuznik, Frederic
    Tourancheau, Bernard
    Roux, Jean-Jacques
    [J]. INTERNATIONAL JOURNAL OF HIGH PERFORMANCE COMPUTING APPLICATIONS, 2011, 25 (03) : 295 - 303
  • [3] Implementation of Multi-GPU Based Lattice Boltzmann Method for Flow Through Porous Media
    Huang, Changsheng
    Shi, Baochang
    He, Nanzhong
    Chai, Zhenhua
    [J]. ADVANCES IN APPLIED MATHEMATICS AND MECHANICS, 2015, 7 (01) : 1 - 12
  • [4] Multi-GPU performance of incompressible flow computation by lattice Boltzmann method on GPU cluster
    Wang, Xian
    Aoki, Takayuki
    [J]. PARALLEL COMPUTING, 2011, 37 (09) : 521 - 535
  • [5] Optimizing Communications in multi-GPU Lattice Boltzmann Simulations
    Calore, Enrico
    Marchi, Davide
    Schifano, Sebastiano Fabio
    Tripiccione, Raffaele
    [J]. PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING & SIMULATION (HPCS 2015), 2015 : 55 - 62
  • [6] Multi-GPU implementation of a hybrid thermal lattice Boltzmann solver using the TheLMA framework
    Obrecht, Christian
    Kuznik, Frederic
    Tourancheau, Bernard
    Roux, Jean-Jacques
    [J]. COMPUTERS & FLUIDS, 2013, 80 : 269 - 275
  • [7] A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code
    Biferale, Luca
    Mantovani, Filippo
    Pivanti, Marcello
    Pozzati, Fabio
    Sbragaglia, Mauro
    Scagliarini, Andrea
    Schifano, Sebastiano Fabio
    Toschi, Federico
    Tripiccione, Raffaele
    [J]. PARALLEL PROCESSING AND APPLIED MATHEMATICS, PT I, 2012, 7203 : 640 - 650
  • [8] Adjoint Lattice Boltzmann for topology optimization on multi-GPU architecture
    Laniewski-Wollk, L.
    Rokicki, J.
    [J]. COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2016, 71 (03) : 833 - 848
  • [9] Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI
    Xu, Ao
    Li, Bo-Tao
    [J]. INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER, 2023, 201