Deep reinforcement learning with robust augmented reward sequence prediction for improving GNSS positioning

Cited by: 0
|
Authors
Tang, Jianhao [1 ,2 ]
Li, Zhenni [1 ,3 ]
Yu, Qingsong [4 ]
Zhao, Haoli [5 ]
Zeng, Kungan [6 ]
Zhong, Shiguang [4 ]
Wang, Qianming [1 ,7 ]
Xie, Kan [1 ,2 ]
Kuzin, Victor [8 ]
Xie, Shengli [2 ,9 ]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou 510006, Peoples R China
[2] Guangdong Key Lab IoT Informat Technol, Guangzhou 510006, Peoples R China
[3] Guangdong HongKong Macao Joint Lab Smart Discrete, Guangzhou 510006, Guangdong, Peoples R China
[4] Guangzhou Haige Commun Grp Inc Co, Guangzhou 510006, Peoples R China
[5] Hebei Univ Technol, Sch Artificial Intelligence, Tianjin 300401, Peoples R China
[6] Sun Yat Sen Univ, Sch Software Engn, Zhuhai 519000, Peoples R China
[7] Taidou Microelect Technol Co Ltd, Guangzhou 510006, Peoples R China
[8] Acad Russian Engn Acad, Moscow, Russia
[9] Key Lab Intelligent Detect & Internet Things Mfg G, Guangzhou 510006, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Deep reinforcement learning; Robust augmented reward sequence prediction; Generalization; GNSS positioning;
DOI
10.1007/s10291-025-01824-w
Chinese Library Classification
TP7 [Remote Sensing Technology];
Discipline Classification Codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
Data-driven technologies have shown promising potential for improving GNSS positioning, since they can analyze observation data to learn the complex hidden characteristics of system models without rigorous prior assumptions. However, in complex urban areas, the input observation data contain task-irrelevant noisy GNSS measurements arising from stochastic effects such as signal reflections from tall buildings. Moreover, a data distribution shift between the training and testing phases arises in dynamically changing environments. These problems limit the robustness and generalizability of data-driven GNSS positioning methods in urban areas. In this paper, a novel deep reinforcement learning (DRL) method is proposed to improve the robustness and generalizability of data-driven GNSS positioning. Specifically, to address the data distribution shift in dynamically changing environments, the robust Bellman operator (RBO) is incorporated into the DRL optimization to model deviations in the data distribution and to enhance generalizability. To improve robustness against task-irrelevant noisy GNSS measurements, long-term reward sequence prediction (LRSP) is adopted to learn robust representations by extracting task-relevant information from GNSS observations. We thus develop a DRL method with robust augmented reward sequence prediction to correct the rough position solved by model-based methods. Moreover, a novel real-world GNSS positioning dataset is built, containing different scenes in urban areas. Our experiments were conducted on the public dataset Google smartphone decimeter challenge 2022 (GSDC2022) and the built dataset Guangzhou GNSS version 2 (GZGNSS-V2), and demonstrate that the proposed method can outperform model-based and state-of-the-art data-driven methods in terms of generalizability across different environments.
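The robust Bellman operator mentioned in the abstract replaces the nominal expected next-state value with a worst-case value over an uncertainty set of transition distributions. The following is a minimal illustrative sketch of one tabular robust Bellman backup under an R-contamination uncertainty set; the radius `rho`, the tabular setting, and the function name are assumptions for illustration, not the authors' actual formulation or implementation.

```python
import numpy as np

def robust_bellman_update(Q, s, a, r, next_states, probs, gamma=0.99, rho=0.1):
    """One robust Bellman backup on a tabular Q (shape: [n_states, n_actions]).

    Mixes the nominal expected next-state value with the worst-case value
    over the whole state space (R-contamination set of radius rho), so the
    learned policy hedges against deviations between the training and
    deployment transition distributions.
    """
    next_vals = Q.max(axis=1)                 # max_a' Q(s', a') for every state
    nominal = probs @ next_vals[next_states]  # E_P[V(s')] under the nominal model
    worst = next_vals.min()                   # adversarial (worst-case) next value
    target = r + gamma * ((1.0 - rho) * nominal + rho * worst)
    Q[s, a] = target
    return Q
```

With `rho = 0` this reduces to the standard Bellman backup; larger `rho` trades nominal performance for robustness to distribution shift.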
Pages: 28
Related Papers
50 records in total
  • [31] Generalization in Deep Reinforcement Learning for Robotic Navigation by Reward Shaping
    Miranda, Victor R. F.
    Neto, Armando A.
    Freitas, Gustavo M.
    Mozelli, Leonardo A.
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2024, 71 (06) : 6013 - 6020
  • [32] Improving Deep Reinforcement Learning via Transfer
    Du, Yunshu
    AAMAS '19: PROCEEDINGS OF THE 18TH INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS AND MULTIAGENT SYSTEMS, 2019, : 2405 - 2407
  • [33] Dopamine encodes a quantitative reward prediction error for reinforcement learning
    Glimcher, PW
    Mullette-Gillman, OA
    Bayer, HM
    Lau, B
    Rutledge, R
    NEUROPSYCHOPHARMACOLOGY, 2005, 30 : S27 - S27
  • [34] Learning positioning policies for mobile manipulation operations with deep reinforcement learning
    Iriondo, Ander
    Lazkano, Elena
    Ansuategi, Ander
    Rivera, Andoni
    Lluvia, Iker
    Tubio, Carlos
    INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (09) : 3003 - 3023
  • [35] Learning positioning policies for mobile manipulation operations with deep reinforcement learning
    Ander Iriondo
    Elena Lazkano
    Ander Ansuategi
    Andoni Rivera
    Iker Lluvia
    Carlos Tubío
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 3003 - 3023
  • [36] Inferring reward prediction errors in patients with schizophrenia: a dynamic reward task for reinforcement learning
    Li, Chia-Tzu
    Lai, Wen-Sung
    Liu, Chih-Min
    Hsu, Yung-Fong
    FRONTIERS IN PSYCHOLOGY, 2014, 5
  • [37] Robust Reinforcement Learning via Progressive Task Sequence
    Li, Yike
    Tian, Yunzhe
    Tong, Endong
    Niu, Wenjia
    Liu, Jiqiang
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 455 - 463
  • [38] Robust Reward-Free Actor-Critic for Cooperative Multiagent Reinforcement Learning
    Lin, Qifeng
    Ling, Qing
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (12) : 17318 - 17329
  • [39] Reward-based participant selection for improving federated reinforcement learning
    Lee, Woonghee
    ICT EXPRESS, 2023, 9 (05): : 803 - 808
  • [40] Diversity-augmented intrinsic motivation for deep reinforcement learning
    Dai, Tianhong
    Du, Yali
    Fang, Meng
    Bharath, Anil Anthony
    NEUROCOMPUTING, 2022, 468 : 396 - 406