Informative Trajectory Planning Using Reinforcement Learning for Minimum-Time Exploration of Spatiotemporal Fields

Cited by: 1
Authors
Li, Zhuo [1, 2]
You, Keyou [3, 4]
Sun, Jian [1, 2]
Wang, Gang [1, 2]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100081, Peoples R China
[2] Beijing Inst Technol, Chongqing Innovat Ctr, Chongqing 401120, Peoples R China
[3] Tsinghua Univ, Dept Automat, Beijing 100084, Peoples R China
[4] Tsinghua Univ, BNRist, Beijing 100084, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Autonomous vehicles; optimal control; reinforcement learning (RL); trajectory optimization;
DOI
10.1109/TNNLS.2023.3300926
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article studies the informative trajectory planning problem of an autonomous vehicle for field exploration. In contrast to existing works concerned with maximizing the amount of information about spatial fields, this work considers efficient exploration of spatiotemporal fields with unknown distributions and seeks minimum-time trajectories of the vehicle while respecting a cumulative information constraint. Upon adopting the observability constant as an information measure for expressing the cumulative information constraint, the existence of a minimum-time trajectory is proven under mild conditions. Given the spatiotemporal nature of the field, the problem is modeled as a Markov decision process (MDP), for which a reinforcement learning (RL) algorithm is proposed to learn a continuous planning policy. To accelerate policy learning, we design a new reward function by leveraging field approximations, which is demonstrated to yield dense rewards. Simulations show that the learned policy can steer the vehicle to achieve efficient exploration, and that it outperforms the commonly used coverage planning method in terms of exploration time for sufficient cumulative information.
Pages: 1 / 11
Number of pages: 11
Related papers
50 in total
  • [1] Continuous advantage learning for minimum-time trajectory planning of autonomous vehicles
    Zhuo LI
    Weiran WU
    Jialin WANG
    Gang WANG
    Jian SUN
    [J]. Science China(Information Sciences), 2024, (07) - 294
  • [2] Pontryagin's Minimum Principle-Guided RL for Minimum-Time Exploration of Spatiotemporal Fields
    Li, Zhuo
    Sun, Jian
    Marques, Antonio G.
    Wang, Gang
    You, Keyou
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024,
  • [3] Continuous advantage learning for minimum-time trajectory planning of autonomous vehicles
    Li, Zhuo
    Wu, Weiran
    Wang, Jialin
    Wang, Gang
    Sun, Jian
    [J]. SCIENCE CHINA-INFORMATION SCIENCES, 2024, 67 (07)
  • [4] Continuous advantage learning for minimum-time trajectory planning of autonomous vehicles
    Zhuo LI
    Weiran WU
    Jialin WANG
    Gang WANG
    Jian SUN
    [J]. Science China(Information Sciences), 2024, 67 (07) - 294
  • [5] KINEMATIC MINIMUM-TIME TRAJECTORY PLANNING FOR A MANIPULATOR
    叶桦
    冯纯伯
    [J]. Journal of Southeast University(English Edition), 1991, (01) : 85 - 90
  • [6] Minimum-Time Trajectory Planning Under Intermittent Measurements
    Penin, Bryan
    Giordano, Paolo Robuffo
    Chaumette, Francois
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2019, 4 (01) : 153 - 160
  • [7] A MINIMUM-TIME TRAJECTORY PLANNING METHOD FOR 2 ROBOTS
    BIEN, ZN
    LEE, JH
    [J]. IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, 1992, 8 (03) : 414 - 418
  • [8] Minimum-Time Trajectory Planning for Helicopter UAVs using Computational Dynamic Optimization
    Xu, Nathan
    Cai, Guowei
    Kang, Wei
    Chen, Ben M.
    [J]. PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2012, : 2732 - 2737
  • [9] Global minimum-time trajectory planning of mechanical manipulators using interval analysis
    Piazzi, A
    Visioli, A
    [J]. INTERNATIONAL JOURNAL OF CONTROL, 1998, 71 (04) : 631 - 652
  • [10] On-line minimum-time trajectory planning for industrial manipulators
    Kim, Joon-Young
    Kim, Dong-Hyeok
    Kim, Sung-Rak
    [J]. 2007 INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS, VOLS 1-6, 2007, : 1789 - 1793