A hierarchical parallel implementation for heterogeneous computing. Application to algebra-based CFD simulations on hybrid supercomputers

Cited by: 13
Authors
Alvarez-Farre, Xavier [1]
Gorobets, Andrey [2]
Trias, F. Xavier [1]
Affiliations
[1] Tech Univ Catalonia, Heat & Mass Transfer Technol Ctr, Carrer Colom 11, Terrassa 08222, Barcelona, Spain
[2] Russian Acad Sci, Keldysh Inst Appl Math, Miusskaya Sq 4, Moscow 125047, Russia
Funding
Russian Science Foundation;
Keywords
Parallel CFD; SpMV; Heterogeneous computing; Hybrid supercomputer; CPU+GPU; MPI+OpenMP+OpenCL; CUDA;
DOI
10.1016/j.compfluid.2020.104768
Chinese Library Classification (CLC) number
TP39 [Computer applications];
Discipline classification code
081203; 0835;
Abstract
The quest for new portable implementations of simulation algorithms is motivated by the increasing variety of computing architectures. Moreover, the hybridization of high-performance computing systems imposes additional constraints, since heterogeneous computations are needed to efficiently engage both processors and massively parallel accelerators. This, in turn, involves different parallel paradigms and computing frameworks and requires complex data exchanges between computing units. Typically, simulation codes rely on sophisticated data structures and computing subroutines, so-called kernels, which makes portability terribly cumbersome. Thus, a natural way to achieve portability is to dramatically reduce the complexity of both data structures and computing kernels. In our algebra-based approach, the scale-resolving simulation of incompressible turbulent flows on unstructured meshes relies on three fundamental kernels: the sparse matrix-vector product, the linear combination of vectors, and the dot product. It is noteworthy that this approach is not limited to a particular kind of numerical method or a set of governing equations. In our code, an auto-balanced multilevel partitioning distributes the workload among computing devices of various architectures. The overlap of computations and multistage communications efficiently hides the data-exchange overhead in large-scale supercomputer simulations. In addition to computing on accelerators, special attention is paid to efficiency on manycore processors in multiprocessor nodes with a significant non-uniform memory access (NUMA) factor. Parallel efficiency and performance are studied in detail for different execution modes on various supercomputers using up to 9,600 processor cores and up to 256 graphics processing units. The heterogeneous implementation model described in this work is a general-purpose approach that is well suited for various subroutines in numerical simulation codes. © 2020 Elsevier Ltd. All rights reserved.
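The three kernels named in the abstract can be illustrated with a minimal, serial Python sketch. It is not the authors' implementation: the function names (`spmv`, `axpy`, `dot`), the CSR storage format, and the small test matrix are assumptions chosen for illustration, and the abstract does not state which sparse format or interfaces the code uses.

```python
from typing import List, Sequence


def spmv(row_ptr: Sequence[int], col_idx: Sequence[int],
         values: Sequence[float], x: Sequence[float]) -> List[float]:
    """Sparse matrix-vector product y = A x, with A stored in CSR format.

    CSR is an assumption made for this sketch, not a detail from the abstract.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Non-zeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def axpy(a: float, x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Linear combination of vectors: returns a * x + y."""
    return [a * xi + yi for xi, yi in zip(x, y)]


def dot(x: Sequence[float], y: Sequence[float]) -> float:
    """Dot product of two vectors of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))


def residual(row_ptr: Sequence[int], col_idx: Sequence[int],
             values: Sequence[float], x: Sequence[float],
             b: Sequence[float]) -> List[float]:
    """r = b - A x, composed only from the kernels above."""
    return axpy(-1.0, spmv(row_ptr, col_idx, values, x), b)


if __name__ == "__main__":
    # Hypothetical 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]] in CSR.
    row_ptr = [0, 2, 5, 7]
    col_idx = [0, 1, 0, 1, 2, 1, 2]
    values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
    x = [1.0, 2.0, 3.0]
    b = [1.0, 1.0, 1.0]

    r = residual(row_ptr, col_idx, values, x, b)
    print("residual:", r)
    print("squared residual norm:", dot(r, r))
```

The `residual` helper shows why a small kernel set can be attractive for portability: a solver step written this way touches the hardware only through three routines, so porting to another device would, under this assumption, mean reimplementing those routines rather than the whole algorithm.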
Pages: 10
Related papers
12 items in total
  • [1] HPC2-A fully-portable, algebra-based framework for heterogeneous computing. Application to CFD
    Alvarez, X.
    Gorobets, A.
    Trias, F. X.
    Borrell, R.
    Oyarzun, G.
    [J]. COMPUTERS & FLUIDS, 2018, 173 : 285 - 292
  • [2] Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers
    Oyarzun, Guillermo
    Borrell, Ricard
    Gorobets, Andrey
    Oliva, Assensi
    [J]. INTERNATIONAL JOURNAL OF COMPUTATIONAL FLUID DYNAMICS, 2017, 31 (09) : 396 - 411
  • [3] The Hierarchical Heterogeneous of Parallel Computing Model Based on Method Library
    Duan, Jibing
    Ji, Xiaopeng
    Dou, Jinye
    Wei, Zhiqiang
    [J]. PROCEEDINGS OF 2013 CHINESE INTELLIGENT AUTOMATION CONFERENCE: INTELLIGENT INFORMATION PROCESSING, 2013, 256 : 255 - 263
  • [4] Implementation of Motion Estimation Based On Heterogeneous Parallel Computing System with OpenCL
    Zhang, Jinglin
    Nezan, Jean-Francois
    Cousin, Jean-Gabriel
    [J]. 2012 IEEE 14TH INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS & 2012 IEEE 9TH INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS (HPCC-ICESS), 2012, : 41 - 45
  • [5] A hybrid optimization based on GPU parallel computing method and its application of three dimensional large eddy simulations
    Zhang, Yuxuan
    Wu, Songping
    [J]. APPLIED MECHANICS AND MATERIALS I, PTS 1-3, 2013, 275-277 : 2589 - 2594
  • [6] SunwayLB: Enabling Extreme-Scale Lattice Boltzmann Method Based Computing Fluid Dynamics Simulations on Advanced Heterogeneous Supercomputers
    Liu, Zhao
    Chu, Xuesen
    Lv, Xiaojing
    Meng, Hongsong
    Liu, Hanyue
    Zhu, Guanghui
    Fu, Haohuan
    Yang, Guangwen
    [J]. IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2024, 35 (02) : 324 - 337
  • [7] A hybrid MPI-OpenMP parallel implementation for pseudospectral simulations with application to Taylor-Couette flow
    Shi, Liang
    Rampp, Markus
    Hof, Bjoern
    Avila, Marc
    [J]. COMPUTERS & FLUIDS, 2015, 106 : 1 - 11
  • [8] Large-scale homo- and heterogeneous parallel paradigm design based on CFD application PHengLEI
    Wan, Yunbo
    Zhao, Zhong
    Liu, Jie
    Zhang, Laiping
    Zhang, Yong
    Chen, Jianqiang
    [J]. CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (05):
  • [9] A New Hybrid Hierarchical Parallel Algorithm to Enhance the Performance of Large-Scale Structural Analysis Based on Heterogeneous Multicore Clusters
    Yu, Gaoyuan
    Lou, Yunfeng
    Dong, Hang
    Li, Junjie
    Jin, Xianlong
    [J]. CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2023, 136 (01) : 135 - 155
  • [10] A NEW SOFTWARE IMPLEMENTATION OF ULTRASONIC PHASED ARRAY INSPECTION AND 3D IMAGING APPLICATION BASED ON GPU PARALLEL COMPUTING
    Gao, Da-liang
    Shi, Fang-fang
    Zhang, Bi-xing
    [J]. PROCEEDINGS OF 2016 SYMPOSIUM ON PIEZOELECTRICITY, ACOUSTIC WAVES, AND DEVICE APPLICATIONS (SPAWDA), 2016, : 131 - 134