Optimization and acceleration of flow simulations for CFD on CPU/GPU architecture

Cited by: 11
Authors
Lei, Jiang [1]
Li, Da-li [1]
Zhou, Yun-long [1]
Liu, Wei [1]
Affiliations
[1] National University of Defense Technology, College of Aerospace Science and Engineering, Changsha 410073, Hunan, People's Republic of China
Keywords
Euler equation; GPU; CUDA; CFD; Direct numerical simulation; Incompressible flows; Solver
DOI
10.1007/s40430-019-1793-9
Chinese Library Classification number
TH [Machinery and instrument industry]
Discipline classification code
0802
Abstract
As the demand for computational power in computational fluid dynamics (CFD) grows, graphics processing units (GPUs), with their high floating-point throughput, play an increasingly important role. This work ports an Euler solver using the MUSCL and NND schemes from central processing units (CPUs) to three different CPU/GPU heterogeneous hardware platforms, and investigates the resulting acceleration for the one-dimensional (1D) Riemann problem and for two-dimensional (2D) flow past a forward-facing step. The paper first introduces how heterogeneous systems work, in terms of hardware structure, memory model and programming method. It then presents in detail the three heterogeneous methods employed in the study; of these, porting all parts of the solver loop to the GPU performed best. Several optimization strategies suited to the solver were adopted to achieve substantial speedups, including the use of GPU shared memory, which has been reported relatively rarely in the CFD literature. Finally, the 1D Riemann simulation verified the reliability of the modified GPU code and showed that both schemes capture discontinuities well. The two cases, each with its 1D computational domain discretized into 10,000 cells, both achieved a speedup exceeding 25 relative to execution on a single CPU core. In the 2D step-flow simulation, the highest speedups were 260 for the MUSCL scheme on an 800×400 mesh and 144 for the NND scheme on a 400×200 mesh.
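To make the MUSCL scheme named in the abstract more concrete, here is a minimal Python sketch of a second-order MUSCL reconstruction of interface states on a uniform 1D grid. It is an illustration under stated assumptions, not the authors' implementation: the minmod limiter, the zero-slope treatment of boundary cells, and the function names `minmod` and `muscl_interface_states` are all hypothetical choices, and the abstract does not say which limiter the paper uses.

```python
def minmod(a, b):
    """Return the smaller-magnitude argument if a and b share a sign, else 0."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_interface_states(u):
    """MUSCL reconstruction of left/right states at interfaces i+1/2.

    Assumptions (not from the paper): uniform grid, minmod limiter,
    and zero slope (first-order fallback) in the two boundary cells.
    Returns two lists of length len(u) - 1.
    """
    n = len(u)
    slope = [0.0] * n
    for i in range(1, n - 1):
        # Limit between the backward and forward differences of cell i.
        slope[i] = minmod(u[i] - u[i - 1], u[i + 1] - u[i])
    # State just left of interface i+1/2, extrapolated from cell i.
    left = [u[i] + 0.5 * slope[i] for i in range(n - 1)]
    # State just right of interface i+1/2, extrapolated from cell i+1.
    right = [u[i + 1] - 0.5 * slope[i + 1] for i in range(n - 1)]
    return left, right


if __name__ == "__main__":
    # Step data, loosely resembling a Riemann-problem initial condition.
    print(muscl_interface_states([1.0, 1.0, 1.0, 0.0, 0.0, 0.0]))
```

Each cell's limited slope depends only on its two neighbours, which is the kind of local stencil that can be assigned to one GPU thread per cell. On step data the limiter returns zero slope next to the jump, so the reconstructed states introduce no new extrema.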
Pages: 15