The quest for raw computing power has shifted from increasing processor clock speeds to increasing the number of processing cores. Mainstream CPUs are currently available in dual-socket quad-core and hex-core configurations, while graphics cards provide hundreds of processing cores. Although various scientific applications, including underwater acoustic models, have been implemented on graphics hardware, widespread use of this technology has been hampered by the often extraordinary effort needed to program the hardware, especially when the application architecture did not match the canonical graphics pipeline used for gaming. In the last few years, the major graphics board manufacturers have stepped away from designing hardware specialized for particular new graphic special effects and have made a concerted effort to provide general-purpose computing capabilities of the sort that can be exploited for scientific computing. For example, NVIDIA's "Compute Unified Device Architecture" (CUDA) environment currently provides many building blocks for scientific computing, such as (subsets of) BLAS, LAPACK, and FFTs. We will present our experiences implementing the split-step Fourier parabolic equation (PE) model in the CUDA environment, showing how we achieved a tenfold speedup relative to a multi-core CPU implementation with a modest investment in programming effort.

In the repertoire of wave propagation modeling approaches, a parabolic equation model is typically used for range-dependent problems in which a ray tracing approach would not provide enough fidelity (e.g., because a high-frequency approximation is not warranted for the waveguide being modeled). PE models are narrowband, so a broadband application requires running multiple frequencies to cover the band of interest, followed by synthesis via an inverse FFT to form the predicted time-domain waveform; the independent per-frequency runs offer obvious opportunities for parallelization.
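To make the role of the FFT concrete, here is a minimal sketch of one range step of the standard narrow-angle split-step Fourier PE marching scheme. This is not the authors' code; the function name and parameters are illustrative, and NumPy stands in for the GPU FFT library:

```python
import numpy as np

def ssf_pe_step(psi, n_profile, dz, dr, k0):
    """One range step of the narrow-angle split-step Fourier PE.

    psi       : complex field sampled on the depth grid at range r
    n_profile : index of refraction versus depth at this range
    dz, dr    : depth and range step sizes; k0: reference wavenumber
    """
    # vertical wavenumbers corresponding to the depth grid
    kz = 2.0 * np.pi * np.fft.fftfreq(psi.size, d=dz)
    # diffraction step, applied as a phase multiply in the transform domain
    psi = np.fft.ifft(np.exp(-1j * kz**2 * dr / (2.0 * k0)) * np.fft.fft(psi))
    # refraction step: environmental phase screen in the depth domain
    return psi * np.exp(1j * k0 * (n_profile - 1.0) * dr)
```

Each marching step is two FFTs plus element-wise phase multiplies, which is why a fast GPU FFT dominates the achievable speedup.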
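The broadband synthesis step can likewise be sketched in a few lines. Each single-frequency PE run is independent (the parallelism noted above); an inverse FFT then assembles the predicted time series. The helper below is hypothetical and glosses over amplitude and windowing conventions:

```python
import numpy as np

def broadband_synthesis(field_per_freq, freqs, fs, nt):
    """Assemble a time-domain waveform from single-frequency PE results.

    field_per_freq : complex receiver field from each narrowband PE run
    freqs          : the modeled frequencies (Hz)
    fs, nt         : sampling rate and length of the output time series
    """
    spectrum = np.zeros(nt, dtype=complex)
    bins = np.round(np.asarray(freqs) * nt / fs).astype(int)
    spectrum[bins] = field_per_freq          # place each run in its FFT bin
    # inverse FFT synthesizes the predicted waveform (real part taken)
    return np.fft.ifft(spectrum).real * nt
```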
This application was initially selected because its key software component, the FFT, is available in a mature GPU-based implementation. In addition, a multi-core CPU implementation of the FFT is also available, enabling a very direct comparison of CPU versus GPU performance using nearly identical code bases. We will describe the key steps needed to adapt this model to the GPU architecture. For example, an important aspect of accelerating applications on GPU architectures is effectively exploiting the different memory types that reside on the device. Since the bandwidth between cores within the GPU is 5-10 times greater than the bandwidth between the CPU and the GPU, it is important to minimize the amount of data transferred in and out of the GPU. Fortunately, GPUs also provide a type of memory called texture memory, which conveniently offers hardware-accelerated interpolation; thus, a sparse representation of the range-dependent waveguide parameters (sound speed profile, bathymetry, geoacoustic parameters of the seabed) can be loaded into texture memory, where it can be interpolated to the resolution required by the PE calculations. We will present benchmark comparisons between our GPU-based PE implementation and two other PE approaches on several canonical range-dependent modeling problems, comparing accuracy and degree of acceleration.
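On the CPU, the texture lookup described above amounts to interpolating a sparse environmental profile onto the fine grid the PE marching requires. A minimal NumPy stand-in is shown below (illustrative only; on the GPU this lookup is performed by the texture units in hardware, with no explicit interpolation code):

```python
import numpy as np

def interp_profile(sparse_depths, sparse_values, grid_depths):
    """Linearly interpolate a sparsely sampled environmental parameter
    (e.g., a sound speed profile) onto the fine PE depth grid.
    CPU analogue of a hardware texture fetch."""
    return np.interp(grid_depths, sparse_depths, sparse_values)
```

Keeping only the sparse samples resident on the GPU, and interpolating on demand, is one way to minimize the CPU-to-GPU transfers discussed above.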