Simulation and reconstruction for 3D elastic wave using multi-GPU and CUDA-aware MPI

Authors
Cai, Wei [1]
Zhu, Peimin [1]
Li, Ziang [1]
Affiliations
[1] China Univ Geosci, Sch Geophys & Geomat, Wuhan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D elastic FDTD; Wavefield reconstruction; High performance computing; Domain decomposition technique; Multi-GPU parallel computing; CUDA-aware MPI; REVERSE TIME MIGRATION; PROPAGATION; ACCELERATION; COMPUTATION; INVERSION; FIELD;
DOI
10.1016/j.cageo.2024.105616
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
3D finite-difference time-domain numerical simulation and reconstruction based on the domain decomposition technique are essential parts of high-performance computation for reverse-time migration and full-waveform inversion. However, low GPU utilization when computing small models and tremendous memory consumption for large models may result in low computational efficiency and high memory costs. This paper proposes a contiguous memory management (CMM) method and a variable-order wavefield reconstruction (VWR) method. The CMM allocates the many small arrays used for MPI communications within a single larger contiguous memory block. This reduces the number of MPI communications between subdomains and improves the communication bandwidth, thus reducing the MPI time overhead and improving GPU utilization. Meanwhile, the VWR can flexibly set the number of layers of boundary wavefield used for source wavefield reconstruction according to the host memory capacity and accuracy requirements. Because as few as one layer of boundary wavefield needs to be stored with the VWR, host memory consumption can be significantly reduced. Numerical experiments show that GPU utilization for a model of size 121³ can be improved from 25% to 90% using the CMM method, and that the VWR method can reduce memory consumption by about 86% while maintaining good accuracy in wavefield reconstruction. In addition, the paper discusses how to obtain a domain decomposition scheme with optimal performance.
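The contiguous-memory idea described in the abstract can be illustrated with a minimal host-side sketch. It is not the authors' implementation: the nine component names (a common velocity-stress formulation), the halo length of 4, and the helper names `build_layout`, `allocate_block`, and `field_view` are all assumptions made for illustration, and no MPI or GPU call is involved. The sketch only shows how many small halo arrays can live inside one contiguous block, so that a single message could carry what would otherwise need one message per array.

```python
from array import array

# Assumed wavefield components (hypothetical; the abstract does not list them).
COMPONENTS = ["vx", "vy", "vz", "sxx", "syy", "szz", "sxy", "sxz", "syz"]
HALO_LEN = 4  # assumed number of halo values per component, for illustration


def build_layout(names, length):
    """Map each component name to its (offset, length) inside one block."""
    return {name: (i * length, length) for i, name in enumerate(names)}


def allocate_block(layout):
    """Allocate a single contiguous float32 buffer large enough for all halos."""
    total = sum(length for _, length in layout.values())
    return array("f", [0.0] * total)


def field_view(block, layout, name):
    """Return a zero-copy view of one component's halo inside the block."""
    start, length = layout[name]
    return memoryview(block)[start:start + length]


layout = build_layout(COMPONENTS, HALO_LEN)
block = allocate_block(layout)

# Each component still writes to "its own" array; the writes land in the block.
for i, name in enumerate(COMPONENTS):
    halo = field_view(block, layout, name)
    for j in range(HALO_LEN):
        halo[j] = float(i)

# Separate arrays would need one message each; the packed block needs one.
messages_separate = len(COMPONENTS)
messages_packed = 1
```

Because each view aliases the block rather than copying it, the solver code can keep addressing components by name while the communication layer hands over the whole block in one transfer. In a real multi-GPU code the block would be device memory passed directly to a CUDA-aware MPI call; that part is omitted here.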
Pages: 10