Implementation of hybrid MPI+OpenMP parallelization on unstructured CFD solver and its applications in massive unsteady simulations

Cited by: 0
Authors
Wang N. [1 ]
Chang X. [1 ]
Zhao Z. [2 ]
Zhang L. [1 ,2 ]
Affiliations
[1] State Key Laboratory of Aerodynamics, China Aerodynamics Research and Development Center, Mianyang
[2] Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang
Funding
National Natural Science Foundation of China
Keywords
Computational fluid dynamics; MPI+OpenMP hybrid parallelization; Overset grids; Parallel efficiency; Unsteady simulation;
DOI
10.7527/S1000-6893.2020.23859
Abstract
In engineering applications, the computational cost of unsteady flow simulations such as store separation is massive, and it grows even larger when higher accuracy is pursued by refining the grid or adopting higher-order methods. Unsteady flow simulation is therefore both time-consuming and expensive in CFD engineering practice, and its scalability and efficiency need to be improved. To exploit the potential of multi-core CPUs with both distributed and shared memory, the Message Passing Interface (MPI) is adopted for inter-node communication and OpenMP for intra-node shared-memory parallelism. This paper first implements MPI+OpenMP hybrid parallelization, in both coarse-grain and fine-grain forms, in our in-house code HyperFLOW. The Common Research Model (CRM) with about 40 million unstructured grid cells is employed to test the implementation on an in-house cluster. The results show that coarse-grain hybrid parallelization is superior at small scales and reaches its highest efficiency with 16 threads, whereas the fine-grain approach is better suited to large-scale parallelization and reaches its highest efficiency with 8 threads. In addition, unstructured overset grids with 0.36 billion and 2.88 billion cells are generated for the wing/store separation standard model. Reading these massive grids and completing the overset grid assembly takes only dozens of seconds when the peer-to-peer (P2P) grid reading mode and the optimized implicit overset assembly method are adopted. The unsteady store separation process is simulated and the parallel efficiency is measured. With 0.36 billion cells, the parallel efficiency on 12 288 cores is 90% (relative to 768 cores) on the in-house cluster and 70% (relative to 384 cores) on the Tianhe-2 supercomputer. The numerical six-degree-of-freedom (6-DOF) results agree well with the experimental data. Finally, for the grid with 2.88 billion cells, parallel efficiency tests are conducted with 4.9×10⁴ CPU cores on the in-house cluster, and the parallel efficiency reaches 55.3% (relative to 4 096 cores). © 2020, Beihang University Aerospace Knowledge Press. All rights reserved.
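The hybrid scheme described above uses MPI across nodes and OpenMP threads within a node. The sketch below is not taken from HyperFLOW or the paper; it is a minimal fine-grain illustration with invented placeholder data (the cell count and the trivial residual loop are assumptions for demonstration only), showing OpenMP threading over a per-rank cell loop combined with an MPI reduction across ranks. Parallel efficiency quoted "relative to" a baseline core count follows the usual definition E = (N_base · T_base) / (N · T_N), where T is the wall-clock time for the same problem on N cores.

```c
/* Minimal fine-grain MPI+OpenMP sketch. Not the HyperFLOW implementation:
 * the cell count and residual update are placeholders for illustration.
 * Build (typical): mpicc -fopenmp hybrid_sketch.c -o hybrid_sketch */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int provided, rank, nprocs;
    /* Request FUNNELED support so OpenMP threads can coexist with MPI,
     * with only the master thread making MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int ncells = 1000000;               /* cells owned by this rank (placeholder) */
    double *residual = malloc(ncells * sizeof(double));

    /* Fine-grain hybrid: each MPI rank spawns OpenMP threads over its own cell loop. */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < ncells; ++i) {
        residual[i] = 1.0 / (double)(i + 1);  /* flux/residual update would go here */
    }

    /* Inter-node data exchange stays at the MPI level (one call per rank). */
    double local_sum = 0.0, global_sum = 0.0;
    for (int i = 0; i < ncells; ++i)
        local_sum += residual[i];
    MPI_Allreduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("ranks=%d threads/rank=%d residual sum=%e\n",
               nprocs, omp_get_max_threads(), global_sum);

    free(residual);
    MPI_Finalize();
    return 0;
}
```

In a coarse-grain variant, by contrast, the OpenMP parallel region would enclose a whole grid-zone or sub-domain loop rather than individual cell loops, which matches the paper's observation that the two approaches peak at different thread counts.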