Pontryagin's Minimum Principle-Guided RL for Minimum-Time Exploration of Spatiotemporal Fields

Cited by: 0
Authors
Li, Zhuo [1, 2]
Sun, Jian [1, 2]
Marques, Antonio G. [3]
Wang, Gang [1, 2]
You, Keyou [4, 5]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100084, Peoples R China
[2] Beijing Inst Technol, Chongqing Innovat Ctr, Chongqing 401120, Peoples R China
[3] King Juan Carlos Univ, Dept Signal Theory & Commun, Madrid 28943, Spain
[4] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[5] Tsinghua Univ, BNRist, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Trajectory planning; Spatiotemporal phenomena; Trajectory; Planning; Observability; Optimization; Vehicle dynamics; Exploration of spatiotemporal field; functional constraint; minimum-time trajectory planning; Pontryagin's minimum principle (PMP); reinforcement learning (RL)
DOI
10.1109/TNNLS.2024.3379654
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This article studies the trajectory planning problem of an autonomous vehicle for exploring a spatiotemporal field subject to a constraint on cumulative information. Since the resulting problem depends on the signal strength distribution of the field, which is unknown in practice, we advocate the use of a model-free reinforcement learning (RL) method to find the solution. Given the vehicle's dynamical model, a critical (and open) question is how to judiciously merge the model-based optimality conditions into the model-free RL framework for improved efficiency and generalization, for which this work provides some positive results. Specifically, we discretize the continuous action space by leveraging analytic optimality conditions for the minimum-time optimization problem via Pontryagin's minimum principle (PMP). This allows us to develop a novel discrete PMP-based RL trajectory planning algorithm, which learns a planning policy faster than those based on a continuous action space. Simulation results: 1) validate the effectiveness of the PMP-based RL algorithm and 2) demonstrate its advantages, in terms of both learning efficiency and the vehicle's exploration time, over two baseline methods for continuous control inputs.
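As a rough illustration of the discretization idea in the abstract: for minimum-time problems with control-affine dynamics and box-bounded inputs, the Hamiltonian is linear in the control, so PMP places the minimizing control at a vertex of the box (away from singular arcs). The sketch below builds such a vertex set as a discrete action space. This record does not give the paper's vehicle model or its actual action set, so the bounds, the two-channel control, and the function names here are hypothetical.

```python
from itertools import product


def extremal_actions(bounds):
    """Return the vertices of a box-shaped control set.

    bounds: list of (low, high) pairs, one per control channel.
    The channels and their limits are illustrative assumptions.
    """
    return [tuple(v) for v in product(*bounds)]


def hamiltonian_minimizer(costate_gain, actions):
    """Pick the action minimizing sum_i costate_gain[i] * u[i].

    This is the control-dependent part of the Hamiltonian when the
    dynamics are affine in the control; costate_gain is a placeholder
    for the costate-weighted input matrix evaluated at the current state.
    """
    return min(
        actions,
        key=lambda u: sum(g * ui for g, ui in zip(costate_gain, u)),
    )


# Hypothetical limits: acceleration in [-1, 1], turn rate in [-0.5, 0.5].
actions = extremal_actions([(-1.0, 1.0), (-0.5, 0.5)])

# With a positive gain on channel 0 and a negative gain on channel 1,
# the minimizer takes the lower bound on channel 0 and the upper on channel 1.
best = hamiltonian_minimizer([0.3, -2.0], actions)
```

An RL agent restricted to `actions` chooses among four candidates per step instead of searching a continuous box, which is one plausible reading of why a discrete PMP-based action space could speed up learning.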
Pages: 1-13
Number of pages: 13
Related papers
50 in total
  • [1] Informative Trajectory Planning Using Reinforcement Learning for Minimum-Time Exploration of Spatiotemporal Fields
    Li, Zhuo
    You, Keyou
    Sun, Jian
    Wang, Gang
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2023, 35 (12) : 1 - 11
  • [2] Q-learning and Pontryagin's Minimum Principle
    Mehta, Prashant
    Meyn, Sean
    [J]. PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, : 3598 - 3605
  • [3] Efficient MPC optimization using Pontryagin's minimum principle
    Cannon, Mark
    Liao, Weiheng
    Kouvaritakis, Basil
    [J]. INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL, 2008, 18 (08) : 831 - 844
  • [4] ECMS as a realization of Pontryagin's minimum principle for HEV control
    Serrao, Lorenzo
    Onori, Simona
    Rizzoni, Giorgio
    [J]. 2009 AMERICAN CONTROL CONFERENCE, VOLS 1-9, 2009, : 3964 - 3969
  • [5] Finite time formation control for multiple vehicles based on Pontryagin's minimum principle
    Geng Z.-Y.
    [J]. Geng, Zhi-Yong (zygeng@pku.edu.cn), 1600, Science Press (43): : 40 - 59
  • [6] A Costate Estimation for Pontryagin's Minimum Principle by Machine Learning
    Kang, Changbeom
    Song, Changhee
    Cha, Sukwon
    [J]. 2018 IEEE VEHICLE POWER AND PROPULSION CONFERENCE (VPPC), 2018,
  • [7] PONTRYAGIN'S MINIMUM PRINCIPLE FOR FUZZY OPTIMAL CONTROL PROBLEMS
    Farhadinia, B.
    [J]. IRANIAN JOURNAL OF FUZZY SYSTEMS, 2014, 11 (02): : 27 - 43
  • [8] Efficient MPC optimization using Pontryagin's Minimum Principle
    Cannon, Mark
    Liao, Weiheng
    Kouvaritakis, Basil
    [J]. PROCEEDINGS OF THE 45TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-14, 2006, : 5459 - 5464
  • [9] A time-optimal aircraft-following model based on Pontryagin’s minimum principle
    Linghang Meng
    Xiaohao Xu
    Zengxian Geng
    [J]. Journal of Modern Transportation, 2011, 19 (4): : 268 - 273
  • [10] A time-optimal aircraft-following model based on Pontryagin's minimum principle
    Linghang Meng
    [J]. Railway Engineering Science, 2011, (04) : 268 - 273