Efficient computation of the geopotential gradient in graphic processing units

Authors
Rubio, Carlos [1]
Gonzalo, Jesus [1]
Siminski, Jan [2]
Escapa, Alberto [1]
Affiliations
[1] Univ Leon, Dept Aerosp Engn, Leon, Spain
[2] European Space Operat Ctr ESA ESOC, European Space Agcy, Darmstadt, Germany
Keywords
Geopotential gradient computation; LEO orbit propagation; Cunningham formulation; GPU parallelization; CUDA; ORBIT DETERMINATION; HARMONIC SYNTHESIS; MODELS; GPS
DOI
10.1016/j.asr.2024.04.056
Chinese Library Classification
V [Aeronautics, Astronautics]
Discipline classification code
08; 0825
Abstract
Efficient computation of the geopotential gradient is essential for numerical propagators, particularly in scenarios involving low Earth orbits (LEO). Conventional geopotential calculations are based on spherical harmonic series, which become computationally demanding as the degree/order increases. This computational burden can be mitigated by means of parallelized algorithms. Additionally, certain situations lend themselves to high parallelization, such as the propagation of space debris catalogs, satellite mega-constellations, or the dispersion of particles resulting from a space collision event. This paper introduces an optimized Graphics Processing Unit (GPU) implementation designed to facilitate extensive parallelization in the geopotential gradient calculation. The formulation developed in this study is not specific to any GPU. However, to illustrate the low-level optimizations necessary for an efficient implementation, we selected the Compute Unified Device Architecture (CUDA) as the dominant and de facto standard in parallel computing. Nevertheless, most of the concepts and optimizations presented in this paper are also valid for other GPU architectures. Built upon the spherical harmonic expansion using the Cunningham formulation, which is well-suited for GPU computations, our implementation offers several variants with different tradeoffs between speed and accuracy. Besides GPU double precision, we introduced a mixed precision arithmetic, a hybrid between single and double precision, that exploits GPU capabilities with a low penalty in accuracy. The proposed algorithm was implemented as a reusable software module, and its performance was evaluated against the GMAT, GODOT, and Orekit astrodynamics codes. The algorithm's accuracy in double precision is comparable to that of these codes. The mixed precision version showed enough accuracy for LEO satellite propagation, with around 1 m difference in four days. Testing across different CUDA architectures revealed very high speed-up factors compared to a single CPU, reaching a speed-up of 645 for the mixed precision variant and 450 for the double precision one in the propagation of about 3200 objects with a geopotential of degree/order 126 × 126 using an A100 GPU device.
(c) 2024 COSPAR. Published by Elsevier B.V. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
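To make the Cunningham formulation mentioned in the abstract more concrete, the following is a minimal sketch in plain double-precision Python. It is not the paper's CUDA implementation and reproduces none of its optimizations or its mixed precision variant. It assumes the standard textbook recursion for the unnormalized functions V_nm and W_nm, unnormalized coefficients stored as triangular lists `C[n][m]` and `S[n][m]`, and Earth-fixed Cartesian coordinates. The function names `cunningham_vw`, `potential`, and `acceleration` are hypothetical and chosen only for this illustration.

```python
import math

def cunningham_vw(x, y, z, R, nmax):
    """Unnormalized Cunningham functions V[n][m], W[n][m] for 0 <= m <= n <= nmax."""
    r2 = x * x + y * y + z * z
    xr, yr, zr, rho = x * R / r2, y * R / r2, z * R / r2, R * R / r2
    V = [[0.0] * (nmax + 1) for _ in range(nmax + 1)]
    W = [[0.0] * (nmax + 1) for _ in range(nmax + 1)]
    V[0][0] = R / math.sqrt(r2)
    for m in range(nmax + 1):
        if m > 0:  # sectorial (diagonal) terms from the previous diagonal
            V[m][m] = (2 * m - 1) * (xr * V[m - 1][m - 1] - yr * W[m - 1][m - 1])
            W[m][m] = (2 * m - 1) * (xr * W[m - 1][m - 1] + yr * V[m - 1][m - 1])
        for n in range(m + 1, nmax + 1):  # recursion in degree at fixed order m
            a = (2 * n - 1) / (n - m) * zr
            b = (n + m - 1) / (n - m) * rho
            v2 = V[n - 2][m] if n - 2 >= m else 0.0
            w2 = W[n - 2][m] if n - 2 >= m else 0.0
            V[n][m] = a * V[n - 1][m] - b * v2
            W[n][m] = a * W[n - 1][m] - b * w2
    return V, W

def potential(pos, GM, R, C, S):
    """Geopotential U = GM/R * sum(C_nm V_nm + S_nm W_nm)."""
    nmax = len(C) - 1
    V, W = cunningham_vw(pos[0], pos[1], pos[2], R, nmax)
    total = 0.0
    for n in range(nmax + 1):
        for m in range(n + 1):
            total += C[n][m] * V[n][m] + S[n][m] * W[n][m]
    return GM / R * total

def acceleration(pos, GM, R, C, S):
    """Gradient of U, assembled from V and W evaluated up to degree nmax + 1."""
    nmax = len(C) - 1
    V, W = cunningham_vw(pos[0], pos[1], pos[2], R, nmax + 1)
    ax = ay = az = 0.0
    for n in range(nmax + 1):
        for m in range(n + 1):
            c, s = C[n][m], S[n][m]
            if m == 0:
                ax -= c * V[n + 1][1]
                ay -= c * W[n + 1][1]
            else:
                f = (n - m + 2) * (n - m + 1)
                ax += 0.5 * (-c * V[n + 1][m + 1] - s * W[n + 1][m + 1]
                             + f * (c * V[n + 1][m - 1] + s * W[n + 1][m - 1]))
                ay += 0.5 * (-c * W[n + 1][m + 1] + s * V[n + 1][m + 1]
                             + f * (-c * W[n + 1][m - 1] + s * V[n + 1][m - 1]))
            az += (n - m + 1) * (-c * V[n + 1][m] - s * W[n + 1][m])
    k = GM / (R * R)
    return (k * ax, k * ay, k * az)

# Illustrative values only: a point mass plus rough degree-2 terms.
GM, R = 3.986004418e14, 6378137.0
C = [[1.0], [0.0, 0.0], [-1.0826e-3, 0.0, 1.57e-6]]
S = [[0.0], [0.0, 0.0], [0.0, 0.0, -9.0e-7]]
print(acceleration((5.0e6, 3.0e6, 4.0e6), GM, R, C, S))
```

Each call depends only on one position and the shared coefficient tables, so evaluations for different objects are independent of one another, which is the property that suits the catalog and constellation scenarios described in the abstract. A production code at degree/order 126 would normally use normalized coefficients and functions for numerical stability; the unnormalized form is kept here only for brevity.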
Pages: 332-347
Page count: 16