Panda: A Compiler Framework for Concurrent CPU+GPU Execution of 3D Stencil Computations on GPU-accelerated Supercomputers

Authors
Sourouri, Mohammed [1, 2]
Baden, Scott B. [3]
Cai, Xing [1, 2]
Affiliations
[1] Simula Res Lab, Oslo, Norway
[2] Univ Oslo, Dept Informat, Oslo, Norway
[3] Univ Calif San Diego, Dept Comp Sci & Engn, San Diego, CA 92103 USA
Keywords
Source-to-source translation; Code generation; Code optimization; CUDA; OpenMP; MPI; Stencil computation; Heterogeneous computing; CPU+GPU computing; CODE
DOI
10.1007/s10766-016-0454-1
Chinese Library Classification (CLC) number
TP301 [Theory and methods]
Discipline classification code
081202
Abstract
We present a new compiler framework for truly heterogeneous 3D stencil computation on GPU clusters. Our framework consists of a simple directive-based programming model and a tightly integrated source-to-source compiler. Annotated with a small number of directives, sequential stencil C codes can be automatically parallelized for large-scale GPU clusters. The most distinctive feature of the compiler is its capability to generate hybrid MPI+CUDA+OpenMP code that uses concurrent CPU+GPU computing to unleash the full potential of powerful GPU clusters. The auto-generated hybrid codes hide the overhead of various data movements by overlapping them with computation. Test results on the Titan supercomputer and the Wilkes cluster show that auto-translated codes can achieve about 90% of the performance of highly optimized handwritten codes, for both a simple stencil benchmark and a real-world application in cardiac modeling. The user-friendliness and performance of our domain-specific compiler framework allow harnessing the full power of GPU-accelerated supercomputing without painstaking coding effort.
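To make concrete what a "sequential stencil C code" of the kind mentioned in the abstract might look like, the following is a minimal sketch of one sweep of a 7-point 3D stencil. The names `idx` and `stencil_sweep` and the coefficients `c0` and `c1` are hypothetical and chosen only for illustration. This record does not give the actual Panda directive syntax, so the place where an annotation would go is marked with a comment rather than a guessed pragma.

```c
#include <stddef.h>

/* Flatten (i, j, k) into a 1D index for a grid stored with k fastest.
   Hypothetical helper, not taken from the paper. */
static size_t idx(size_t i, size_t j, size_t k, size_t ny, size_t nz)
{
    return (i * ny + j) * nz + k;
}

/* One sweep of a 7-point stencil over the interior of an nx*ny*nz grid.
   Boundary points of `out` are left untouched.
   c0 weights the centre point, c1 weights each of the six neighbours. */
void stencil_sweep(const double *in, double *out,
                   size_t nx, size_t ny, size_t nz,
                   double c0, double c1)
{
    /* A directive-based framework would expect its annotation here;
       the concrete syntax is an unknown and is deliberately omitted. */
    for (size_t i = 1; i + 1 < nx; i++) {
        for (size_t j = 1; j + 1 < ny; j++) {
            for (size_t k = 1; k + 1 < nz; k++) {
                out[idx(i, j, k, ny, nz)] =
                    c0 * in[idx(i, j, k, ny, nz)] +
                    c1 * (in[idx(i - 1, j, k, ny, nz)] +
                          in[idx(i + 1, j, k, ny, nz)] +
                          in[idx(i, j - 1, k, ny, nz)] +
                          in[idx(i, j + 1, k, ny, nz)] +
                          in[idx(i, j, k - 1, ny, nz)] +
                          in[idx(i, j, k + 1, ny, nz)]);
            }
        }
    }
}
```

Each output point depends only on its nearest neighbours in the input array, so the iterations are independent. That independence is what allows a grid to be split between CPU and GPU and across MPI ranks, with only the subdomain boundary layers needing to be exchanged.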
Pages: 711-729
Page count: 19