Deep reinforcement learning with robust augmented reward sequence prediction for improving GNSS positioning

Citations: 0
Authors
Tang, Jianhao [1 ,2 ]
Li, Zhenni [1 ,3 ]
Yu, Qingsong [4 ]
Zhao, Haoli [5 ]
Zeng, Kungan [6 ]
Zhong, Shiguang [4 ]
Wang, Qianming [1 ,7 ]
Xie, Kan [1 ,2 ]
Kuzin, Victor [8 ]
Xie, Shengli [2 ,9 ]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] Guangdong Key Lab IoT Informat Technol, Guangzhou 510006, Peoples R China
[3] Guangdong HongKong Macao Joint Lab Smart Discrete, Guangzhou 510006, Guangdong, Peoples R China
[4] Guangzhou Haige Commun Grp Inc Co, Guangzhou 510006, Peoples R China
[5] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[6] Sun Yat Sen Univ, Sch Software Engn, Zhuhai 519000, Peoples R China
[7] Taidou Microelect Technol Co Ltd, Guangzhou 510006, Peoples R China
[8] Acad Russian Engn Acad, Moscow, Russia
[9] Key Lab Intelligent Detect & Internet Things Mfg G, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep reinforcement learning; Robust augmented reward sequence prediction; Generalization; GNSS positioning;
DOI
10.1007/s10291-025-01824-w
CLC number
TP7 [Remote Sensing Technology];
Discipline codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Data-driven technologies have shown promising potential for improving GNSS positioning, as they can learn the complex hidden characteristics of system models from observation data without rigorous prior assumptions. However, in complex urban areas, the input observations contain task-irrelevant noisy GNSS measurements arising from stochastic noise, such as signal reflections from tall buildings. Moreover, a data distribution shift between the training and testing phases arises in dynamically changing environments. These problems limit the robustness and generalizability of data-driven GNSS positioning methods in urban areas. In this paper, a novel deep reinforcement learning (DRL) method is proposed to improve the robustness and generalizability of data-driven GNSS positioning. Specifically, to address the data distribution shift in dynamically changing environments, a robust Bellman operator (RBO) is incorporated into the DRL optimization to model deviations in the data distribution and enhance generalizability. To improve robustness against task-irrelevant noisy GNSS measurements, long-term reward sequence prediction (LRSP) is adopted to learn robust representations by extracting task-relevant information from GNSS observations. The resulting DRL method with robust augmented reward sequence prediction corrects the rough position solved by model-based methods. Moreover, a novel real-world GNSS positioning dataset is built, covering different scenes in urban areas. Experiments on the public Google Smartphone Decimeter Challenge 2022 (GSDC2022) dataset and the newly built Guangzhou GNSS version 2 (GZGNSS-V2) dataset demonstrate that the proposed method outperforms model-based and state-of-the-art data-driven methods in terms of generalizability across different environments.
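The paper's exact formulation is not given in this record, but the core idea of a robust Bellman operator can be illustrated on a small tabular MDP. The sketch below (an assumption for illustration, not the authors' algorithm) uses an R-contamination uncertainty set: with contamination level `rho`, the adversary may redirect a `rho` fraction of the transition mass to the worst successor state, so the backed-up value is a pessimistic mixture of the nominal expectation and the worst-case next value.

```python
import numpy as np

def robust_value_iteration(P, R, gamma=0.9, rho=0.1, iters=200):
    """Robust value iteration under an R-contamination uncertainty set.

    P: (S, A, S) nominal transition probabilities
    R: (S, A) rewards
    rho: contamination level (rho=0 recovers standard value iteration)
    """
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        nominal = P @ V          # expected next value per (s, a), shape (S, A)
        worst = V.min()          # adversary sends rho of the mass here
        Q = R + gamma * ((1 - rho) * nominal + rho * worst)
        V = Q.max(axis=1)        # greedy improvement over actions
    return V

# Toy 2-state, 2-action MDP
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0], [0.0, 0.5]])

V_robust = robust_value_iteration(P, R, rho=0.1)
V_nominal = robust_value_iteration(P, R, rho=0.0)
```

Because the robust backup lower-bounds the nominal one, the robust values are pointwise no larger than the nominal values; the gap is the price paid for guarding against distribution shift between training and deployment.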
Pages: 28
Related papers
50 records
  • [1] Improving GNSS Positioning Correction Using Deep Reinforcement Learning with an Adaptive Reward Augmentation Method
    Tang, Jianhao
    Li, Zhenni
    Hou, Kexian
    Li, Peili
    Zhao, Haoli
    Wang, Qianming
    Liu, Ming
    Xie, Shengli
NAVIGATION-JOURNAL OF THE INSTITUTE OF NAVIGATION, 2024, 71 (04)
  • [2] Learning Robust Representation for Reinforcement Learning with Distractions by Reward Sequence Prediction
    Zhou, Qi
    Wang, Jie
    Liu, Qiyuan
    Kuang, Yufei
    Zhou, Wengang
    Li, Houqiang
    UNCERTAINTY IN ARTIFICIAL INTELLIGENCE, 2023, 216 : 2551 - 2562
  • [3] Improving performances of GNSS positioning correction using multiview deep reinforcement learning with sparse representation
    Zhao, Haoli
    Li, Zhenni
    Wang, Qianming
    Xie, Kan
    Xie, Shengli
    Liu, Ming
    Chen, Ci
    GPS SOLUTIONS, 2024, 28 (03)
  • [4] Robust Average-Reward Reinforcement Learning
    Wang, Yue
    Velasquez, Alvaro
    Atia, George
    Prater-Bennette, Ashley
    Zou, Shaofeng
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 80 : 719 - 803
  • [6] Prediction of Reward Functions for Deep Reinforcement Learning via Gaussian Process Regression
    Lim, Jaehyun
    Ha, Seungchul
    Choi, Jongeun
    IEEE-ASME TRANSACTIONS ON MECHATRONICS, 2020, 25 (04) : 1739 - 1746
  • [7] Skill Reward for Safe Deep Reinforcement Learning
    Cheng, Jiangchang
    Yu, Fumin
    Zhang, Hongliang
    Dai, Yinglong
    UBIQUITOUS SECURITY, 2022, 1557 : 203 - 213
  • [8] Hindsight Reward Shaping in Deep Reinforcement Learning
    de Villiers, Byron
    Sabatta, Deon
    2020 INTERNATIONAL SAUPEC/ROBMECH/PRASA CONFERENCE, 2020, : 653 - 659
  • [9] Positioning Performance in Deep Pit Mines using GNSS Augmented with Locata
    Evans, Maria J.
    Eagen, Sean Evans
    PROCEEDINGS OF THE 2021 INTERNATIONAL TECHNICAL MEETING OF THE INSTITUTE OF NAVIGATION, 2021, : 295 - 306
  • [10] Dynamic Positioning using Deep Reinforcement Learning
    Overeng, Simen Sem
    Nguyen, Dong Trong
    Hamre, Geir
    OCEAN ENGINEERING, 2021, 235