A Safe and Self-Recoverable Reinforcement Learning Framework for Autonomous Robots

Cited by: 0

Authors:

Wang, Weiqiang [1,2]

Zhou, Xu [2]

Xu, Benlian [2]

Lu, Mingli [2]

Zhang, Yuxin [2]

Gu, Yuhang [2]

Affiliations:

[1] Yancheng Inst Technol, Yancheng 224000, Peoples R China

[2] Changshu Inst Technol, Suzhou 215500, Peoples R China

Source:

2022 41st Chinese Control Conference (CCC) | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

safe reinforcement learning; self-recoverable reinforcement learning; autonomous robots;

DOI:

Not available

Chinese Library Classification:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Reinforcement learning (RL) holds the promise of autonomous robots because it can adapt to dynamic or unknown environments by automatically learning optimal control policies from the interactions between robots and environments. However, the interactions can be unsafe to both robots and environments during the learning phase, which hinders the practical deployment of RL. Some safe RL methods have been proposed to improve the learning safety by using external or prior knowledge to guide safe actions, but it is difficult to assume having this knowledge in practical applications, especially in unknown environments. More importantly, considering failures are unavoidable in practice, current safe RL lacks the capability of recovering to safe states from failures so that the learning cannot be continued and finished. To solve these problems, we propose a safe and self-recoverable reinforcement learning framework that can predict and prohibit other unsafe actions based on known, explored unsafe actions during the exploration process, and can self-recover to a safe state when a failure occurs. The maze navigation simulation results show that our approach can not only significantly reduce the number of failures but also accelerate the convergence of reinforcement learning.
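The two mechanisms named in the abstract, prohibiting actions already known to be unsafe and recovering to a safe state after a failure, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not describe how unsafe actions are predicted or how recovery is carried out, so the sketch assumes plain tabular Q-learning, blocks only (state, action) pairs that have already caused a failure, and recovers by returning to the last safe cell. The maze layout, rewards, and all names (`MAZE`, `step`, `train`) are hypothetical.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' hazard (entering one is a failure).
MAZE = ["S..#",
        ".#..",
        "...#",
        "#..G"]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Return (next_state, reward, failed, done) for one move."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])):
        return state, -1.0, False, False   # hit the boundary: stay in place
    if MAZE[r][c] == "#":
        return (r, c), -10.0, True, False  # failure: entered a hazard
    if MAZE[r][c] == "G":
        return (r, c), 10.0, False, True   # reached the goal
    return (r, c), -1.0, False, False


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, max_steps=100, seed=0):
    """Tabular Q-learning with an unsafe-action mask and rollback on failure."""
    rng = random.Random(seed)
    q = {}          # (state, action) -> estimated value
    unsafe = set()  # explored (state, action) pairs that caused a failure
    failures = 0
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(max_steps):
            # Prohibit actions already known to be unsafe in this state.
            allowed = [a for a in range(4) if (state, a) not in unsafe]
            if not allowed:
                break
            if rng.random() < epsilon:
                action = rng.choice(allowed)
            else:
                action = max(allowed, key=lambda a: q.get((state, a), 0.0))
            nxt, reward, failed, done = step(state, action)
            if failed:
                failures += 1
                unsafe.add((state, action))
                target = reward
                nxt = state  # assumed recovery: return to the last safe cell
            else:
                best_next = max(q.get((nxt, a), 0.0) for a in range(4))
                target = reward + (0.0 if done else gamma * best_next)
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (target - old)
            state = nxt
            if done:
                break
    return q, unsafe, failures


q_table, unsafe_pairs, failure_count = train()
print("failures:", failure_count, "blocked pairs:", len(unsafe_pairs))
```

Because each failing pair is masked after its first occurrence, the failure count in this sketch is bounded by the number of moves that lead into a hazard, and learning continues within the same episode rather than restarting after a failure.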

Pages: 3878-3883

Page count: 6
