A Safe and Self-Recoverable Reinforcement Learning Framework for Autonomous Robots

Cited by: 0

Authors:

Wang, Weiqiang [1,2]

Zhou, Xu [2]

Xu, Benlian [2]

Lu, Mingli [2]

Zhang, Yuxin [2]

Gu, Yuhang [2]

Affiliations:

[1] Yancheng Inst Technol, Yancheng 224000, Peoples R China

[2] Changshu Inst Technol, Suzhou 215500, Peoples R China

Source:

2022 41ST CHINESE CONTROL CONFERENCE (CCC) | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

safe reinforcement learning; self-recoverable reinforcement learning; autonomous robots;

DOI:

Not available

Chinese Library Classification number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Reinforcement learning (RL) holds promise for autonomous robots because it can adapt to dynamic or unknown environments by automatically learning optimal control policies from interactions between robots and their environments. However, these interactions can be unsafe for both robots and environments during the learning phase, which hinders the practical deployment of RL. Some safe RL methods improve learning safety by using external or prior knowledge to guide safe actions, but it is difficult to assume that this knowledge is available in practical applications, especially in unknown environments. More importantly, although failures are unavoidable in practice, current safe RL methods cannot recover to safe states after a failure, so learning cannot continue to completion. To solve these problems, we propose a safe and self-recoverable reinforcement learning framework that predicts and prohibits other unsafe actions based on the unsafe actions already explored during the exploration process, and that self-recovers to a safe state when a failure occurs. Maze navigation simulation results show that our approach not only significantly reduces the number of failures but also accelerates the convergence of reinforcement learning.

Pages: 3878-3883

Number of pages: 6

50 items in total

[1] Self-recoverable antenna arrays
Joler, M.
[J]. IET MICROWAVES ANTENNAS & PROPAGATION, 2012, 6 (14) : 1608 - 1615
[2] Robust reinforcement learning with UUB guarantee for safe motion control of autonomous robots
Zhang, Ruixian
Han, Yining
Su, Man
Lin, Zefeng
Li, Haowei
Zhang, Lixian
[J]. SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2024, 67 (04) : 1023 - 1039
[3] Robust reinforcement learning with UUB guarantee for safe motion control of autonomous robots
Zhang, RuiXian
Han, YiNing
Su, Man
Lin, ZeFeng
Li, HaoWei
Zhang, LiXian
[J]. SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2024, 67 (01) : 172 - 182
[4] Robust reinforcement learning with UUB guarantee for safe motion control of autonomous robots
RuiXian Zhang
YiNing Han
Man Su
ZeFeng Lin
HaoWei Li
LiXian Zhang
[J]. Science China Technological Sciences, 2024, 67 : 172 - 182
[5] A safe reinforcement learning approach for autonomous navigation of mobile robots in dynamic environments
Zhou, Zhiqian
Ren, Junkai
Zeng, Zhiwen
Xiao, Junhao
Zhang, Xinglong
Guo, Xian
Zhou, Zongtan
Lu, Huimin
[J]. CAAI TRANSACTIONS ON INTELLIGENCE TECHNOLOGY, 2023,
[6] Robust reinforcement learning with UUB guarantee for safe motion control of autonomous robots
ZHANG RuiXian
HAN YiNing
SU Man
LIN ZeFeng
LI HaoWei
ZHANG LiXian
[J]. Science China Technological Sciences, 2024, (01) : 172 - 182
[7] WISEMOVE: A Framework to Investigate Safe Deep Reinforcement Learning for Autonomous Driving
Lee, Jaeyoung
Balakrishnan, Aravind
Gaurav, Ashish
Czarnecki, Krzysztof
Sedwards, Sean
[J]. QUANTITATIVE EVALUATION OF SYSTEMS (QEST 2019), 2019, 11785 : 350 - 354
[8] On the Development of a Self-Recoverable Antenna System
Joler, Miroslav
Christodoulou, Christos G.
[J]. 2009 IEEE ANTENNAS AND PROPAGATION SOCIETY INTERNATIONAL SYMPOSIUM AND USNC/URSI NATIONAL RADIO SCIENCE MEETING, VOLS 1-6, 2009, : 277 - +
[9] Safe Reinforcement Learning on Autonomous Vehicles
Isele, David
Nakhaei, Alireza
Fujimura, Kikuo
[J]. 2018 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2018, : 6162 - 6167
[10] Constructing self-recoverable finite state protocols
Park, JC
Miller, RE
[J]. 1998 IEEE INTERNATIONAL PERFORMANCE, COMPUTING AND COMMUNICATIONS CONFERENCE, 1997, : 419 - 425
