Reachability-Based Trajectory Safeguard (RTS): A Safe and Fast Reinforcement Learning Safety Layer for Continuous Control

Cited by: 23
Authors
Shao, Yifei Simon [1, 2]
Chen, Chao [2]
Kousik, Shreyas [3]
Vasudevan, Ram [1, 2]
Affiliations
[1] Univ Michigan, Sch Mech Engn, Ann Arbor, MI 48109 USA
[2] Univ Michigan, Robot Inst, Ann Arbor, MI 48109 USA
[3] Stanford Univ, Dept Aeronaut & Astronaut, Stanford, CA 94305 USA
Funding
National Science Foundation (USA)
Keywords
Reinforcement learning; robot safety; task and motion planning
DOI
10.1109/LRA.2021.3063989
Chinese Library Classification number
TP24 [Robotics]
Discipline classification code
080202; 1405
Abstract
Reinforcement Learning (RL) algorithms have achieved remarkable performance in decision-making and control tasks by reasoning about long-term, cumulative reward using trial and error. However, during RL training, applying this trial-and-error approach to real-world robots operating in safety-critical environments may lead to collisions. To address this challenge, this letter proposes a Reachability-based Trajectory Safeguard (RTS), which leverages reachability analysis to ensure safety during training and operation. Given a known (but uncertain) model of a robot, RTS precomputes a Forward Reachable Set (FRS) of the robot tracking a continuum of parameterized trajectories. At runtime, the RL agent selects from this continuum in a receding-horizon way to control the robot; the FRS is used to determine whether the agent's choice is safe, and to adjust unsafe choices. The efficacy of this method is illustrated in static environments on three nonlinear robot models, including a 12-D quadrotor drone, in simulation and in comparison with state-of-the-art safe motion planning methods.
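The check-and-adjust loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 2-D point robot whose trajectory parameter is a constant velocity, replaces the precomputed FRS with time-sampled discs inflated by an assumed tracking-error bound, and uses a hypothetical "stop" fail-safe. All names and constants (`reachable_discs`, `is_safe`, `safeguard`, `TRACKING_ERROR`, and so on) are invented for illustration.

```python
import math

# Hypothetical constants; a real FRS is computed offline from the robot model.
ROBOT_RADIUS = 0.2    # robot footprint radius [m]
TRACKING_ERROR = 0.1  # assumed bound on tracking error [m]
HORIZON = 1.0         # planning horizon [s]
STEPS = 10            # time samples along the horizon


def reachable_discs(pos, k):
    """Toy stand-in for an FRS: discs along the straight path for velocity k.

    Sampling in time is a simplification; a true FRS covers all times.
    """
    radius = ROBOT_RADIUS + TRACKING_ERROR
    discs = []
    for i in range(STEPS + 1):
        t = HORIZON * i / STEPS
        discs.append((pos[0] + k[0] * t, pos[1] + k[1] * t, radius))
    return discs


def is_safe(pos, k, obstacles):
    """Return True if no reachable disc touches any circular obstacle."""
    for cx, cy, r in reachable_discs(pos, k):
        for ox, oy, obstacle_radius in obstacles:
            if math.hypot(cx - ox, cy - oy) <= r + obstacle_radius:
                return False
    return True


def safeguard(pos, k_agent, obstacles, candidates):
    """Keep the agent's parameter if safe, else pick the nearest safe candidate."""
    if is_safe(pos, k_agent, obstacles):
        return k_agent
    safe = [k for k in candidates if is_safe(pos, k, obstacles)]
    if not safe:
        return (0.0, 0.0)  # assumed fail-safe: command zero velocity
    return min(safe, key=lambda k: math.dist(k, k_agent))
```

In a receding-horizon loop, `safeguard` would be called once per planning step with the agent's proposed parameter, and the returned parameter would be tracked until the next step. The candidate set here is a finite list supplied by the caller, whereas the abstract describes a continuum of parameterized trajectories.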
Pages: 3663-3670
Number of pages: 8