Optimization and acceleration of flow simulations for CFD on CPU/GPU architecture

Cited by: 11
Authors
Lei, Jiang [1]
Li, Da-li [1]
Zhou, Yun-long [1]
Liu, Wei [1]
Affiliations
[1] Natl Univ Def Technol, Coll Aerosp Sci & Engn, Changsha 410073, Hunan, Peoples R China
Keywords
Euler equation; GPU; CUDA; CFD; Direct numerical simulation; Incompressible flows; Solver
DOI
10.1007/s40430-019-1793-9
Chinese Library Classification number
TH [Machinery and instrument industry]
Discipline classification code
0802
Abstract
As the demand for computational power in computational fluid dynamics (CFD) grows, graphics processing units (GPUs), with their high floating-point throughput, play an increasingly important role. This work ports an Euler solver based on the MUSCL and NND schemes from central processing units (CPUs) to three different CPU/GPU heterogeneous hardware platforms, and then investigates the computational acceleration of a one-dimensional (1D) Riemann problem and a two-dimensional (2D) flow past a forward-facing step. The paper first introduces how heterogeneous systems work in terms of hardware structure, memory models and programming methods. It then presents in detail the three heterogeneous methods employed in the study; of these, porting all parts of the solver loop to the GPU performed best. Several optimization strategies suited to the solver were adopted to achieve substantial execution speedups, including the use of shared memory on the GPU, which has been relatively rarely reported in the CFD literature. Finally, the 1D Riemann simulations verified the reliability of the modified GPU code and demonstrated that both schemes capture discontinuities well. With the 1D computational domain discretized into 10,000 cells, both cases achieved a speedup exceeding 25 relative to execution on a single CPU core. In the 2D step-flow simulation, the highest speedups were 260 for the MUSCL scheme on an 800 × 400 mesh and 144 for the NND scheme on a 400 × 200 mesh.
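The abstract names the MUSCL scheme but does not give its details. As a rough illustration only, the sketch below shows a generic MUSCL-type reconstruction of left and right interface states from 1D cell averages. The minmod limiter, the zero-slope boundary treatment, and the function names `minmod` and `muscl_interface_states` are assumptions made for this example; they are not taken from the paper's solver.

```python
def minmod(a, b):
    """Return the smaller-magnitude slope if a and b share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_interface_states(u):
    """Reconstruct (left, right) states at each interior interface i+1/2.

    u is a list of cell averages. Boundary cells keep a zero slope
    (an assumption of this sketch), so they fall back to first order.
    """
    n = len(u)
    slopes = [0.0] * n
    for i in range(1, n - 1):
        # Limited slope from backward and forward differences.
        slopes[i] = minmod(u[i] - u[i - 1], u[i + 1] - u[i])

    states = []
    for i in range(n - 1):
        u_left = u[i] + 0.5 * slopes[i]            # extrapolate from cell i
        u_right = u[i + 1] - 0.5 * slopes[i + 1]   # extrapolate from cell i+1
        states.append((u_left, u_right))
    return states


if __name__ == "__main__":
    # A step profile: the limiter zeroes the slopes next to the jump,
    # so no new extrema appear at the discontinuity.
    print(muscl_interface_states([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]))
```

Each interface state depends only on a small fixed stencil of neighbouring cells, which is the kind of data-parallel structure that maps naturally onto one GPU thread per cell or interface.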
Pages: 15