Simulation and reconstruction for 3D elastic wave using multi-GPU and CUDA-aware MPI

Authors
Cai, Wei [1]
Zhu, Peimin [1]
Li, Ziang [1]
Affiliations
[1] China Univ Geosci, Sch Geophys & Geomat, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D elastic FDTD; Wavefield reconstruction; High performance computing; Domain decomposition technique; Multi-GPU parallel computing; CUDA-aware MPI; REVERSE TIME MIGRATION; PROPAGATION; ACCELERATION; COMPUTATION; INVERSION; FIELD;
DOI
10.1016/j.cageo.2024.105616
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
3D finite-difference time-domain (FDTD) numerical simulation and wavefield reconstruction based on the domain decomposition technique are essential parts of high-performance computing for reverse-time migration and full-waveform inversion. However, GPU utilization is low for small models, which lowers computational efficiency, and memory consumption is very large for large models, which raises memory costs. This paper proposes a contiguous memory management (CMM) method and a variable-order wavefield reconstruction (VWR) method. CMM places the many small arrays used for MPI communication in a single larger contiguous memory block. This reduces the number of MPI communications between subdomains and improves communication bandwidth, which in turn cuts MPI time overhead and raises GPU utilization. VWR lets the number of boundary-wavefield layers used for source wavefield reconstruction be set flexibly according to host memory capacity and accuracy requirements. Because VWR can store as few as one layer of boundary wavefield, host memory consumption is significantly reduced. Numerical experiments show that the CMM method raises GPU utilization for a model of size 121³ from 25% to 90%, and that the VWR method reduces memory consumption by about 86% while maintaining good accuracy in wavefield reconstruction. The paper also discusses how to obtain a domain decomposition scheme with optimal performance.
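The two ideas in the abstract can be illustrated with a minimal, CPU-only sketch. It is not the authors' implementation: the real method works on GPU memory with CUDA-aware MPI, while this uses only the Python standard library. The helper names (`allocate_contiguous`, `boundary_cells`), the buffer sizes, and the layer counts are all hypothetical. The first part carves several small halo buffers out of one contiguous block, so a single message could carry all of them. The second part counts the cells in a boundary shell of a cubic grid, as a rough proxy for how stored boundary data shrinks when fewer layers are kept.

```python
from array import array


def allocate_contiguous(sizes):
    """Allocate one float32 block and return (block, views).

    Each view is a slice of the same block, so writing through a view
    fills the shared memory without any copy. (Hypothetical helper.)
    """
    block = array("f", [0.0]) * sum(sizes)
    whole = memoryview(block)
    views = []
    offset = 0
    for n in sizes:
        views.append(whole[offset:offset + n])
        offset += n
    return block, views


def boundary_cells(n, layers):
    """Cells in a shell `layers` thick on the surface of an n*n*n grid."""
    inner = max(n - 2 * layers, 0)
    return n ** 3 - inner ** 3


# Three small halo buffers of assumed sizes, packed into one block.
sizes = [4, 6, 2]
block, halos = allocate_contiguous(sizes)
for i, halo in enumerate(halos):
    for j in range(len(halo)):
        halo[j] = float(i + 1)  # stand-in for boundary wavefield values

messages_separate = len(halos)  # one send per small array
messages_packed = 1             # one send for the whole block

# Fraction of boundary storage saved by keeping 1 layer instead of an
# assumed 4 layers on a 121^3 grid (simplified count, per time step).
saved = 1 - boundary_cells(121, 1) / boundary_cells(121, 4)
print(messages_separate, messages_packed, round(saved, 3))
```

In this simplified picture the packed layout trades three small transfers for one larger one, which is the mechanism the abstract credits for lower MPI overhead. The saving computed here depends entirely on the assumed layer counts and is not meant to reproduce the 86% figure reported in the abstract.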
Pages: 10