Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Cited: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Keywords
COLLISION-AVOIDANCE;
DOI
10.1109/IROS55552.2023.10341948
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Mobile robot navigation in human-populated environments, referred to as crowd navigation, has attracted great interest from the research community in recent years. Recently, offline reinforcement learning (RL)-based methods have been introduced to this domain for two reasons: they alleviate the sim2real gap incurred by online RL, which relies on simulators for training, and they scale well because the same dataset can be reused to train for differently customized rewards. However, the performance of the learned navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: when the policy receives an input outside the training data distribution, it risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To realize risk sensitivity and improve the safety of the offline RL agent during deployment, this work proposes a multipolicy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and acts as a risk detector, switching control to the risk-avoiding policy when the robot tends to leave the region supported by the training data. Experimental results show that the proposed method learns crowd navigation from an offline trajectory dataset, and that the risk detector substantially reduces the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
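The abstract describes a switching architecture: at each control step, the Lyapunov density model scores the offline RL policy's latent feature, and control falls back to a force-based policy when that score indicates the state is leaving the data-supported region. Below is a minimal Python sketch of that control loop. The interfaces pi_rl.encode / pi_rl.act and ldm.value, the threshold, and the social-force gains are all illustrative assumptions for exposition, not the authors' actual implementation.

```python
# Hypothetical sketch of the multipolicy control loop described in the abstract.
# pi_rl (offline RL policy with a latent encoder) and ldm (learned Lyapunov
# density model) are assumed objects; their APIs are invented for illustration.
import numpy as np

RISK_THRESHOLD = 0.0  # assumed: LDM score below this means "unsupported state"

def social_force_action(robot_pos, goal, human_positions,
                        k_att=1.0, k_rep=2.0, influence=1.5):
    """Force-based risk-avoiding policy: goal attraction plus repulsion
    from nearby humans (a generic potential-field formulation)."""
    force = k_att * (goal - robot_pos)          # attraction toward the goal
    for h in human_positions:
        diff = robot_pos - h
        dist = np.linalg.norm(diff) + 1e-6
        if dist < influence:                     # only nearby humans repel
            force += k_rep * (1.0 / dist - 1.0 / influence) * diff / dist**3
    return force

def select_action(obs, pi_rl, ldm, robot_pos, goal, human_positions):
    """Use the offline RL action while the LDM deems the latent state
    in-distribution; otherwise switch to the force-based fallback."""
    latent = pi_rl.encode(obs)                   # latent feature of the RL policy
    if ldm.value(latent) < RISK_THRESHOLD:       # tendency to exit supported area
        return social_force_action(robot_pos, goal, human_positions)
    return pi_rl.act(latent)
```

The key design choice reflected here is that the detector operates on the policy's own latent representation, so out-of-distribution inputs are flagged in the same feature space the policy uses to act, rather than in raw observation space.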
Pages: 7456 - 7462
Page count: 7
Related Papers
50 records in total
  • [31] Mobile Robot Navigation based on Deep Reinforcement Learning
    Ruan, Xiaogang
    Ren, Dingqi
    Zhu, Xiaoqing
    Huang, Jing
    PROCEEDINGS OF THE 2019 31ST CHINESE CONTROL AND DECISION CONFERENCE (CCDC 2019), 2019, : 6174 - 6178
  • [32] Reinforcement Learning Based Approach For Mobile Robot Navigation
    Jaseem, Mohammed M.
    Mathew, Robins
    Hiremath, Somashekhar S.
    PROCEEDINGS OF 2019 INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND KNOWLEDGE ECONOMY (ICCIKE' 2019), 2019, : 524 - 527
  • [33] Reinforcement learning-based mobile robot navigation
    Altuntas, Nihal
    Imal, Erkan
    Emanet, Nahit
    Ozturk, Ceyda Nur
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2016, 24 (03) : 1747 - 1767
  • [34] RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents
    Qiu, Wei
    Wang, Xinrun
    Yu, Runsheng
    He, Xu
    Wang, Rundong
    An, Bo
    Obraztsova, Svetlana
    Rabinovich, Zinovi
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [35] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Yin, Chang-ming
    Wang, Han-xing
    Zhao, Fei
    Applied Mathematics and Mechanics (English Edition), 2007, (03): 405 - 416
  • [36] State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning
    Ma, Shuai
    Yu, Jia Yuan
    THIRTY-THIRD AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FIRST INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / NINTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, : 4512 - 4519
  • [37] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Chang-ming Yin
    Wang Han-xing
    Zhao Fei
    Applied Mathematics and Mechanics, 2007, 28 : 405 - 416
  • [38] Gradient-Based Inverse Risk-Sensitive Reinforcement Learning
    Mazumdar, Eric
    Ratliff, Lillian J.
    Fiez, Tanner
    Sastry, S. Shankar
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017
  • [39] Risk-sensitive reinforcement learning applied to control under constraints
    Geibel, P
    Wysotzki, F
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2005, 24 : 81 - 108