Method for scalable and performant GPU-accelerated simulation of multiphase compressible flow

Cited by: 1
Authors
Radhakrishnan, Anand [1]
Le Berre, Henry [1]
Wilfong, Benjamin [1]
Spratt, Jean-Sebastien [3]
Rodriguez Jr, Mauro [4]
Colonius, Tim [3]
Bryngelson, Spencer H. [1, 2]
Affiliations
[1] Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332 USA
[2] Georgia Inst Technol, Daniel Guggenheim Sch Aerosp Engn, Atlanta, GA 30332 USA
[3] CALTECH, Div Engn & Appl Sci, Pasadena, CA 91125 USA
[4] Brown Univ, Sch Engn, Providence, RI 02912 USA
Funding
National Science Foundation (USA);
Keywords
Computational fluid dynamics; Heterogeneous computing; Multiphase flows; RIEMANN PROBLEM; RELAXATION; INTERFACES; FLUIDS;
DOI
10.1016/j.cpc.2024.109238
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Multiphase compressible flows are often characterized by a broad range of space and time scales, entailing large grids and small time steps. Simulations of these flows on CPU-based clusters can thus take several wall-clock days. Offloading the compute kernels to GPUs appears attractive but is memory-bound for many finite-volume and finite-difference methods, damping speedups. Even when realized, GPU-based kernels lead to more intrusive communication and I/O times owing to lower computation costs. We present a strategy for GPU acceleration of multiphase compressible flow solvers that addresses these challenges and obtains large speedups at scale. We use OpenACC for directive-based offloading of all compute kernels while maintaining low-level control when needed. An established Fortran preprocessor and metaprogramming tool, Fypp, enables otherwise hidden compile-time optimizations. This strategy exposes compile-time optimizations and high memory reuse while retaining readable, maintainable, and compact code. Remote direct memory access realized via CUDA-aware MPI and GPUDirect reduces halo-exchange communication time. We implement this approach in the open-source solver MFC [1]. Metaprogramming results in an 8-times speedup of the most expensive kernels compared to a statically compiled program, reaching 46% of peak FLOPs on modern NVIDIA GPUs and high arithmetic intensity (about 10 FLOPs/byte). In representative simulations, a single NVIDIA A100 GPU is 7-times faster compared to an Intel Xeon Cascade Lake (6248) CPU die, or about 300-times faster compared to a single such CPU core. At the same time, near-ideal (97%) weak scaling is observed for at least 13824 GPUs on OLCF Summit. A strong scaling efficiency of 84% is retained for an 8-times increase in GPU count. Collective I/O, implemented via MPI3, helps ensure the negligible contribution of data transfers (< 1% of the wall time for a typical, large simulation). Large many-GPU simulations of compressible (solid-)liquid-gas flows demonstrate the practical utility of this strategy.
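The weak- and strong-scaling percentages quoted in the abstract can be read through the usual textbook definitions of parallel efficiency. The Python sketch below is only an illustration of those definitions: the function names and all timings are hypothetical, chosen so that the outputs match the quoted 84% and 97% figures, and are not measurements from the paper or part of MFC.

```python
def strong_scaling_efficiency(t_base, t_scaled, n_base, n_scaled):
    """Fixed total problem size: ideal speedup equals n_scaled / n_base."""
    speedup = t_base / t_scaled
    return speedup / (n_scaled / n_base)


def weak_scaling_efficiency(t_base, t_scaled):
    """Fixed work per device: ideal wall time stays constant as devices are added."""
    return t_base / t_scaled


# Hypothetical wall times (seconds), NOT data from the paper.
# An 8-times increase in GPU count that yields a 6.72-times speedup
# corresponds to 84% strong-scaling efficiency.
strong = strong_scaling_efficiency(t_base=672.0, t_scaled=100.0, n_base=8, n_scaled=64)

# A run that takes 100 s at scale versus 97 s on the base configuration
# corresponds to 97% weak-scaling efficiency.
weak = weak_scaling_efficiency(t_base=97.0, t_scaled=100.0)

print(f"strong scaling efficiency: {strong:.2f}")
print(f"weak scaling efficiency:   {weak:.2f}")
```

Read this way, 84% strong-scaling efficiency over an 8-times increase in GPU count means the fixed-size problem runs about 6.7 times faster, rather than the ideal 8 times.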
Pages: 11