Massively parallel lattice-Boltzmann codes on large GPU clusters

Cited by: 48
Authors
Calore, E. [1, 2]
Gabbana, A. [1]
Kraus, J. [3]
Pellegrini, E. [1]
Schifano, S. F. [1, 2]
Tripiccione, R. [1, 2]
Affiliations
[1] Univ Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
[2] INFN Ferrara, Via Saragat 1, I-44122 Ferrara, Italy
[3] NVIDIA GmbH, Adenauerstr 20 A4, D-52146 Wurselen, Germany
Keywords
Lattice-Boltzmann; GPU accelerators; Massively parallel programming; Heterogeneous systems; Performance; Portability
DOI
10.1016/j.parco.2016.08.005
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
This paper describes a massively parallel code for a state-of-the-art thermal lattice-Boltzmann method. Our code has been carefully optimized for performance on one GPU and for good scaling behavior extending to a large number of GPUs. Versions of this code have already been used for large-scale studies of convective turbulence. GPUs are becoming increasingly popular in HPC applications, as they are able to deliver higher performance than traditional processors. Writing efficient programs for large clusters is not an easy task, as codes must adapt to increasingly parallel architectures and the overheads of node-to-node communications must be properly handled. We describe the structure of our code, discussing several key design choices that were guided by theoretical models of performance and experimental benchmarks. We present an extensive set of performance measurements and identify the corresponding main bottlenecks; finally, we compare the results of our GPU code with those measured on other currently available high-performance processors. Our results are a production-grade code able to deliver a sustained performance of several tens of Tflops, as well as a design and optimization methodology that can be used for the development of other high-performance applications for computational physics. (C) 2016 Elsevier B.V. All rights reserved.
Pages: 1 - 24
Number of pages: 24
Related papers
50 in total
  • [31] Efficient parallel implementation of the lattice Boltzmann method on large clusters of graphic processing units
    XIONG QinGang
    LI Bo
    XU Ji
    FANG XiaoJian
    WANG XiaoWei
    WANG LiMin
    HE XianFeng
    GE Wei
    State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy of Sciences, Beijing, China; Graduate University of Chinese Academy of Sciences, Beijing, China
    Chinese Science Bulletin, 2012, 57 (07) : 707 - 715
  • [32] XLB: A differentiable massively parallel lattice Boltzmann library in Python
    Ataei, Mohammadmehdi
    Salehipour, Hesam
    COMPUTER PHYSICS COMMUNICATIONS, 2024, 300
  • [33] Massively Parallel A* Search on a GPU
    Zhou, Yichao
    Zeng, Jianyang
    PROCEEDINGS OF THE TWENTY-NINTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2015, : 1248 - 1254
  • [34] Parallel Lattice-Boltzmann Simulation of Transitional Flow on Non-uniform Grids
    Stiebler, Maik
    Freudiger, Soeren
    Krafczyk, Manfred
    Geier, Martin
    COMPUTATIONAL SCIENCE AND HIGH PERFORMANCE COMPUTING IV, 2011, 115 : 283 - 295
  • [35] Lattice-Boltzmann Models for Heat Transfer
    Chenghai SUN
    Communications in Nonlinear Science & Numerical Simulation, 1997, (04) : 212 - 216
  • [36] Lattice-Boltzmann model for bacterial chemotaxis
    Hilpert, M
    JOURNAL OF MATHEMATICAL BIOLOGY, 2005, 51 (03) : 302 - 332
  • [37] Lattice-Boltzmann modeling of dissolution phenomena
    Verhaeghe, F
    Arnout, S
    Blanpain, B
    Wollants, P
    PHYSICAL REVIEW E, 2006, 73 (03):
  • [38] Localized Parallel Algorithm for Bubble Coalescence in Free Surface Lattice-Boltzmann Method
    Donath, Stefan
    Feichtinger, Christian
    Pohl, Thomas
    Goetz, Jan
    Ruede, Ulrich
    EURO-PAR 2009: PARALLEL PROCESSING, PROCEEDINGS, 2009, 5704 : 735 - 746
  • [39] Comparison of implementations of the lattice-Boltzmann method
    Mattila, Keijo
    Hyvaeluoma, Jari
    Timonen, Jussi
    Rossi, Tuomo
    COMPUTERS & MATHEMATICS WITH APPLICATIONS, 2008, 55 (07) : 1514 - 1524
  • [40] Lattice-Boltzmann model of amphiphilic systems
    Theissen, O
    Gompper, G
    Kroll, DM
    EUROPHYSICS LETTERS, 1998, 42 (04) : 419 - 424