Accelerating Lattice Boltzmann Applications with OpenACC

Cited by: 7
Authors
Calore, Enrico [1 ,2 ]
Kraus, Jiri [3 ]
Schifano, Sebastiano Fabio [2 ,4 ]
Tripiccione, Raffaele [1 ,2 ]
Affiliations
[1] Univ Ferrara, Dip Fis & Sci Terra, I-44100 Ferrara, Italy
[2] Ist Nazl Fis Nucl, Ferrara, Italy
[3] NVIDIA GmbH, Wurselen, Germany
[4] Univ Ferrara, Dip Matemat & Informat, I-44100 Ferrara, Italy
Source
Keywords
OpenACC; OpenMPI; Lattice Boltzmann methods; Accelerator computing; Performance analysis; CODE;
DOI
10.1007/978-3-662-48096-0_47
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
An increasingly large number of HPC systems rely on heterogeneous architectures that combine traditional multi-core CPUs with power-efficient accelerators. Designing efficient applications for these systems has been troublesome in the past, as accelerators could usually be programmed only with specific programming languages, such as CUDA, threatening maintainability, portability and correctness. Several new programming environments try to tackle this problem; among them, OpenACC offers a high-level approach based on directives. In OpenACC, one annotates existing C, C++ or Fortran codes with compiler directive clauses to mark program regions to offload and run on accelerators and to identify available parallelism. This approach directly addresses code portability, leaving support for each different accelerator to the compiler, but one has to carefully assess the relative costs of a potentially portable approach versus computing efficiency. In this paper we address precisely this issue, using as a test-bench a massively parallel Lattice Boltzmann code. We implement and optimize this multi-node code using OpenACC and OpenMPI. We also compare its performance with that of the same algorithm written in CUDA, OpenCL and C for GPUs, Xeon-Phi and traditional multi-core CPUs, and characterize its scaling behavior on a large cluster of GPUs through an accurate time model.
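The directive-based annotation described in the abstract can be illustrated with a minimal sketch in C. This is not the authors' code: the function name `relax_bgk`, the arrays `f` and `feq`, and the parameter `omega` are hypothetical, and the loop is only a toy relaxation step loosely modeled on the collision stage of a Lattice Boltzmann update. The `parallel loop` directive and the `copy`/`copyin` data clauses are standard OpenACC.

```c
#include <assert.h>
#include <stddef.h>

/* Toy BGK-style relaxation (hypothetical example, not from the paper):
   move each population f[i] toward its equilibrium value feq[i]
   by a fraction omega. */
void relax_bgk(double *f, const double *feq, size_t n, double omega)
{
    assert(f != NULL && feq != NULL);

    /* The directive marks the loop as a region to offload and states that
       its iterations can run in parallel. copy() moves f to the device and
       back; copyin() moves feq to the device only. A compiler without
       OpenACC support ignores the pragma and runs the loop serially. */
    #pragma acc parallel loop copy(f[0:n]) copyin(feq[0:n])
    for (size_t i = 0; i < n; ++i) {
        f[i] += omega * (feq[i] - f[i]);
    }
}
```

Because the annotation is a pragma, the same source file builds with an ordinary C compiler, which is the portability argument the abstract makes; whether the offloaded version is also efficient is the question the paper investigates.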
Pages: 613 - 624
Number of pages: 12
Related papers
50 in total
  • [1] Performance and portability of accelerated lattice Boltzmann applications with OpenACC
    Calore, Enrico
    Gabbana, Alessandro
    Kraus, Jiri
    Schifano, Sebastiano Fabio
    Tripiccione, Raffaele
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2016, 28 (12) : 3485 - 3502
  • [2] Acceleration of Lattice Boltzmann Simulation via OpenACC
    Shuhao Guo
    Jie Wu
    [J]. Journal of Harbin Institute of Technology (New Series), 2018, 25 (05) : 44 - 52
  • [3] Accelerating the Lattice Boltzmann Method
    Altoyan, Wesson
    Alonso, Juan J.
    [J]. 2023 IEEE AEROSPACE CONFERENCE, 2023
  • [4] Accelerating Spark-Based Applications with MPI and OpenACC
    Alshahrani, Saeed
    Al Shehri, Waleed
    Almalki, Jameel
    Alghamdi, Ahmed M.
    Alammari, Abdullah M.
    [J]. COMPLEXITY, 2021, 2021
  • [5] Accelerated lattice Boltzmann simulation using GPU and OpenACC with data management
    Xu, A.
    Shi, L.
    Zhao, T. S.
    [J]. INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER, 2017, 109 : 577 - 588
  • [6] Multi-GPU thermal lattice Boltzmann simulations using OpenACC and MPI
    Xu, Ao
    Li, Bo-Tao
    [J]. INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER, 2023, 201
  • [8] Accelerating Hydrocodes with OpenACC, OpenCL and CUDA
    Herdman, J. A.
    Gaudin, W. P.
    McIntosh-Smith, S.
    Boulton, M.
    Beckingsale, D. A.
    Mallinson, A. C.
    Jarvis, S. A.
    [J]. 2012 SC COMPANION: HIGH PERFORMANCE COMPUTING, NETWORKING, STORAGE AND ANALYSIS (SCC), 2012 : 465 - 471
  • [9] Particle-resolved thermal lattice Boltzmann simulation using OpenACC on multi-GPUs
    Xu, Ao
    Li, Bo-Tao
    [J]. INTERNATIONAL JOURNAL OF HEAT AND MASS TRANSFER, 2024, 218
  • [10] Accelerating the Parallelization of Lattice Boltzmann Method by Exploiting the Temporal Locality
    Liu, Song
    Zou, Nianjun
    Cui, Yuanzhen
    Wu, Weiguo
    [J]. 2017 15TH IEEE INTERNATIONAL SYMPOSIUM ON PARALLEL AND DISTRIBUTED PROCESSING WITH APPLICATIONS AND 2017 16TH IEEE INTERNATIONAL CONFERENCE ON UBIQUITOUS COMPUTING AND COMMUNICATIONS (ISPA/IUCC 2017), 2017 : 1186 - 1193