Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Cited: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Keywords
COLLISION-AVOIDANCE;
DOI
10.1109/IROS55552.2023.10341948
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mobile robot navigation in human-populated environments, referred to as crowd navigation, has attracted great interest from the research community in recent years. Recently, offline reinforcement learning (RL)-based methods have been introduced to this domain for two reasons: they alleviate the sim2real gap incurred by online RL, which relies on simulators for training, and they scale well because the same dataset can be reused to train for differently customized rewards. However, the performance of the learned navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: when the policy receives an input outside the training data distribution, it risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To realize risk sensitivity and improve the safety of the offline RL agent during deployment, this work proposes a multipolicy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and acts as a risk detector, switching control to the risk-avoiding policy when the robot tends to leave the region supported by the training data. Experimental results show that the proposed method learns crowd navigation from an offline trajectory dataset, and that the risk detector substantially reduces the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
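The abstract describes a switching architecture: at each control step, the Lyapunov density model scores the offline RL policy's latent feature, and control falls back to a force-based policy when that score indicates the state is leaving the data-supported region. Below is a minimal Python sketch of that control loop. The interfaces pi_rl.encode / pi_rl.act and ldm.value, the threshold, and the social-force gains are all illustrative assumptions for exposition, not the authors' actual implementation.

```python
# Hypothetical sketch of the multipolicy control loop described in the abstract.
# pi_rl (offline RL policy with a latent encoder) and ldm (learned Lyapunov
# density model) are assumed objects; their APIs are invented for illustration.
import numpy as np

RISK_THRESHOLD = 0.0  # assumed: LDM score below this means "unsupported state"

def social_force_action(robot_pos, goal, human_positions,
                        k_att=1.0, k_rep=2.0, influence=1.5):
    """Force-based risk-avoiding policy: goal attraction plus repulsion
    from nearby humans (a generic potential-field formulation)."""
    force = k_att * (goal - robot_pos)          # attraction toward the goal
    for h in human_positions:
        diff = robot_pos - h
        dist = np.linalg.norm(diff) + 1e-6
        if dist < influence:                     # only nearby humans repel
            force += k_rep * (1.0 / dist - 1.0 / influence) * diff / dist**3
    return force

def select_action(obs, pi_rl, ldm, robot_pos, goal, human_positions):
    """Use the offline RL action while the LDM deems the latent state
    in-distribution; otherwise switch to the force-based fallback."""
    latent = pi_rl.encode(obs)                   # latent feature of the RL policy
    if ldm.value(latent) < RISK_THRESHOLD:       # tendency to exit supported area
        return social_force_action(robot_pos, goal, human_positions)
    return pi_rl.act(latent)
```

The key design choice reflected here is that the detector operates on the policy's own latent representation, so out-of-distribution inputs are flagged in the same feature space the policy uses to act, rather than in raw observation space.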
Pages: 7456 - 7462
Page count: 7
Related Papers
50 records in total
  • [31] Mobile Robot Navigation based on Deep Reinforcement Learning
    Ruan, Xiaogang
    Ren, Dingqi
    Zhu, Xiaoqing
    Huang, Jing
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 6174 - 6178
  • [32] Reinforcement Learning Based Approach For Mobile Robot Navigation
    Jaseem, Mohammed M.
    Mathew, Robins
    Hiremath, Somashekhar S.
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND KNOWLEDGE ECONOMY (ICCIKE' 2019), 2019, : 524 - 527
  • [33] Reinforcement learning-based mobile robot navigation
    Altuntas, Nihal
    Imal, Erkan
    Emanet, Nahit
    Ozturk, Ceyda Nur
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2016, 24 (03) : 1747 - 1767
  • [34] RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
    Qiu, Wei
    Wang, Xinrun
    Yu, Runsheng
    He, Xu
    Wang, Rundong
    An, Bo
    Obraztsova, Svetlana
    Rabinovich, Zinovi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [35] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Yin, Chang-ming
    Wang, Han-xing
    Zhao, Fei
    Applied Mathematics and Mechanics (English Edition), 2007, (03): 405 - 416
  • [36] State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning
    Ma, Shuai
    Yu, Jia Yuan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 4512 - 4519
  • [37] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Chang-ming Yin
    Wang Han-xing
    Zhao Fei
    Applied Mathematics and Mechanics, 2007, 28 : 405 - 416
  • [38] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
    Mazumdar, Eric
    Ratliff, Lillian J.
    Fiez, Tanner
    Sastry, S. Shankar
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017
  • [39] Risk-sensitive reinforcement learning applied to control under constraints
    Geibel, P
    Wysotzki, F
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 81 - 108