Efficient computation of the geopotential gradient in graphic processing units

Cited by: 0

Authors:

Rubio, Carlos [1]

Gonzalo, Jesus [1]

Siminski, Jan [2]

Escapa, Alberto [1]

Affiliations:

[1] Univ Leon, Dept Aerosp Engn, Leon, Spain

[2] European Space Operat Ctr ESA ESOC, European Space Agcy, Darmstadt, Germany

Source:

ADVANCES IN SPACE RESEARCH | 2024 / Vol. 74 / Issue 01

Keywords:

Geopotential gradient computation; LEO orbit propagation; Cunningham formulation; GPU parallelization; CUDA; ORBIT DETERMINATION; HARMONIC SYNTHESIS; MODELS; GPS;

DOI:

10.1016/j.asr.2024.04.056

Chinese Library Classification (CLC) number:

V [Aeronautics, Astronautics];

Discipline classification code:

08 ; 0825 ;

Abstract:

Efficient computation of the geopotential gradient is essential for numerical propagators, particularly in scenarios involving low Earth orbits. Conventional geopotential calculations are based on spherical harmonics series, which become computationally demanding as the degree/order increases. This computational burden can be mitigated by means of parallelized algorithms. Additionally, certain situations lend themselves to high parallelization, such as the propagation of space debris catalogs, satellite mega-constellations, or the dispersion of particles resulting from a space collision event. This paper introduces an optimized Graphics Processing Unit (GPU) implementation designed to facilitate extensive parallelization in the geopotential gradient calculation. The formulation developed in this study is not specific to any GPU. However, to illustrate the low-level optimizations necessary for an efficient implementation, we selected the Compute Unified Device Architecture (CUDA) as the dominant and de facto standard in parallel computing. Nevertheless, most of the concepts and optimizations presented in this paper are also valid for other GPU architectures. Built upon the spherical harmonic expansion using the Cunningham formulation, which is well-suited for GPU computations, our implementation offers several variants with different tradeoffs between speed and accuracy. Besides GPU double precision, we introduced a mixed precision arithmetic (a hybrid between single and double precision) that exploits GPU capabilities with a low penalty in accuracy. The proposed algorithm was implemented as a software reusable module, and its performance was evaluated against GMAT, GODOT, and Orekit astrodynamic codes. The algorithm's accuracy in double precision is comparable to such codes. The mixed precision version showed enough accuracy for LEO satellite propagation, with around 1 m difference in four days. Testing across different CUDA architectures revealed very high speed-up factors compared to a single CPU, reaching a speed-up of 645 for the mixed precision variant and 450 for the double precision one in the propagation of about 3200 objects with a geopotential of degree/order 126 x 126 using an A100 GPU device. (c) 2024 COSPAR. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Pages: 332-347

Page count: 16
