Implementation of hybrid MPI+OpenMP parallelization on unstructured CFD solver and its applications in massive unsteady simulations

Authors
Wang N. [1]
Chang X. [1]
Zhao Z. [2]
Zhang L. [1,2]
Affiliations
[1] State Key Laboratory of Aerodynamics, China Aerodynamics Research and Development Center, Mianyang
[2] Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang
Funding
National Natural Science Foundation of China
Keywords
Computational fluid dynamics; MPI+OpenMP hybrid parallelization; Overset grids; Parallel efficiency; Unsteady simulation
DOI
10.7527/S1000-6893.2020.23859
Abstract
In conventional engineering applications, the computational cost of unsteady flow simulations such as store separation is massive, and it grows further when higher accuracy is sought by refining the grid or adopting higher-order methods. Unsteady flow simulation is therefore both time-consuming and expensive in CFD engineering practice, and its scalability and efficiency need to be improved. To exploit the potential of multi-core CPUs with both distributed and shared memory, the Message Passing Interface (MPI) is adopted for inter-node communication and OpenMP for intra-node shared-memory parallelism. This paper first implements MPI+OpenMP hybrid parallelization, in both coarse-grain and fine-grain forms, in the in-house code HyperFLOW. The Common Research Model (CRM) with about 40 million unstructured grid cells is used to test the implementation on an in-house cluster. The results show that coarse-grain hybrid parallelization is superior at small scales and reaches its highest efficiency at 16 threads, whereas fine-grain hybrid parallelization is better suited to large-scale runs and reaches its highest efficiency at 8 threads. In addition, unstructured overset grids with 0.36 billion and 2.88 billion cells are generated for the standard wing/store separation model. With the P2P (peer-to-peer) grid reading mode and the optimized implicit overset assembly method, reading these massive grids and completing the overset grid assembly takes only dozens of seconds. The unsteady store separation process is simulated and the parallel efficiency is calculated. For the grid with 0.36 billion cells, the parallel efficiency on 12,288 cores is 90% (relative to 768 cores) on the in-house cluster and 70% (relative to 384 cores) on the Tianhe-2 supercomputer. The numerical 6-DOF (degree-of-freedom) results agree well with the experimental data. Finally, for the grid with 2.88 billion cells, parallel efficiency tests are conducted with 4.9×10⁴ CPU cores on the in-house cluster, and the parallel efficiency reaches 55.3% (relative to 4,096 cores).
© 2020, Beihang University Aerospace Knowledge Press. All rights reserved.
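The abstract quotes each parallel efficiency relative to a baseline core count but does not state the formula. The sketch below assumes the usual strong-scaling definition, in which efficiency is the measured speedup over the baseline divided by the ideal speedup (the ratio of core counts). The function names are hypothetical and are not taken from HyperFLOW.

```python
def relative_efficiency(base_cores, base_time, cores, time):
    """Strong-scaling efficiency of a run relative to a baseline run.

    Assumed definition (not stated in the abstract):
        efficiency = (base_time / time) / (cores / base_cores)
    """
    speedup = base_time / time      # measured speedup over the baseline
    ideal = cores / base_cores      # speedup under perfect scaling
    return speedup / ideal


def implied_speedup(base_cores, cores, efficiency):
    """Speedup over the baseline implied by a quoted relative efficiency."""
    return efficiency * cores / base_cores


if __name__ == "__main__":
    # Figures quoted in the abstract for the 0.36-billion-cell grid.
    in_house = implied_speedup(768, 12288, 0.90)
    tianhe2 = implied_speedup(384, 12288, 0.70)
    print(f"in-house cluster: {in_house:.1f}x over 768 cores")
    print(f"Tianhe-2:         {tianhe2:.1f}x over 384 cores")
```

Under this assumed definition, 90% efficiency on 12,288 cores relative to 768 cores corresponds to a speedup of about 14.4 out of an ideal 16, and 70% relative to 384 cores corresponds to about 22.4 out of an ideal 32.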