Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Cited: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Keywords
Collision avoidance
DOI
10.1109/IROS55552.2023.10341948
CLC Number
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Mobile robot navigation in human-populated environments, known as crowd navigation, has attracted great interest from the research community in recent years. Offline reinforcement learning (RL)-based methods have recently been introduced to this domain because they alleviate the sim2real gap of online RL, which relies on simulators for training, and because the same dataset can be reused to train policies for differently customized rewards. However, the learned navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: given an input outside the training distribution, the policy risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To make the offline RL agent risk-sensitive and improve its safety during deployment, this work proposes a multipolicy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and serves as a risk detector, switching control to the risk-avoiding policy when the robot tends to leave the region supported by the training data. Experimental results show that the proposed method learns crowd navigation from an offline trajectory dataset, and that the risk detector substantially reduces the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
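The multipolicy switching described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`navigate_step`, `social_force_action`), the density threshold, and the simple goal-attraction/human-repulsion fallback are all assumptions standing in for the paper's learned Lyapunov density model and force-based policy.

```python
import numpy as np

def social_force_action(robot_pos, goal, humans, k_goal=1.0, k_rep=2.0):
    """Hypothetical force-based fallback: attraction to the goal plus
    repulsion from each human, growing sharply at close range."""
    force = k_goal * (goal - robot_pos)
    for h in humans:
        diff = robot_pos - h
        dist = np.linalg.norm(diff) + 1e-6
        force += k_rep * diff / dist**3  # stronger push near humans
    return force

def navigate_step(obs, latent, rl_policy, density_model, threshold,
                  robot_pos, goal, humans):
    """One control step of the multipolicy framework: use the offline RL
    policy while the latent state is well supported by the training data,
    otherwise hand control to the risk-avoiding fallback."""
    if density_model(latent) < threshold:  # low support -> risky state
        return social_force_action(robot_pos, goal, humans)
    return rl_policy(obs)
```

In this sketch the density model plays the role of the learned risk detector: it scores the policy's latent feature, and a score below the threshold is treated as a tendency to leave the data-supported region.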
Pages: 7456 - 7462
Page count: 7
Related Papers
50 records in total
  • [41] Risk-Sensitive Reinforcement Learning for URLLC Traffic in Wireless Networks
    Ben Khalifa, Nesrine
    Assaad, Mohamad
    Debbah, Merouane
    2019 IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE (WCNC), 2019,
  • [42] Risk-Sensitive Portfolio Management by using Distributional Reinforcement Learning
    Harnpadungkij, Thammasorn
    Chaisangmongkon, Warasinee
    Phunchongharn, Phond
    2019 IEEE 10TH INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY (ICAST 2019), 2019, : 110 - 115
  • [43] Risk-sensitive reinforcement learning algorithms with generalized average criterion
    Yin Chang-ming
    Wang Han-xing
    Zhao Fei
    APPLIED MATHEMATICS AND MECHANICS-ENGLISH EDITION, 2007, 28 (03) : 405 - 416
  • [44] Risk-Sensitive Reinforcement Learning with Function Approximation: A Debiasing Approach
    Fei, Yingjie
    Yang, Zhuoran
    Wang, Zhaoran
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 139, 2021, 139
  • [45] Deep Reinforcement Learning of Navigation in a Complex and Crowded Environment with a Limited Field of View
    Choi, Jinyoung
    Park, Kyungsik
    Kim, Minsu
    Seok, Sangok
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 5993 - 6000
  • [46] Uncertainty quantification via a memristor Bayesian deep neural network for risk-sensitive reinforcement learning
    Lin, Yudeng
    Zhang, Qingtian
    Gao, Bin
    Tang, Jianshi
    Yao, Peng
    Li, Chongxuan
    Huang, Shiyu
    Liu, Zhengwu
    Zhou, Ying
    Liu, Yuyi
    Zhang, Wenqiang
    Zhu, Jun
    Qian, He
    Wu, Huaqiang
    NATURE MACHINE INTELLIGENCE, 2023, 5 (07) : 714 - 723
  • [48] Risk-sensitive inverse reinforcement learning via semi- and non-parametric methods
    Singh, Sumeet
    Lacotte, Jonathan
    Majumdar, Anirudha
    Pavone, Marco
    INTERNATIONAL JOURNAL OF ROBOTICS RESEARCH, 2018, 37 (13-14): : 1713 - 1740
  • [49] Using the GTSOM Network for Mobile Robot Navigation with Reinforcement Learning
    Menegaz, Mauricio
    Engel, Paulo M.
    IJCNN: 2009 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, VOLS 1- 6, 2009, : 716 - 720
  • [50] Deep Reinforcement Learning Based Mobile Robot Navigation: A Review
    Zhu, Kai
    Zhang, Tao
    TSINGHUA SCIENCE AND TECHNOLOGY, 2021, 26 (05) : 674 - 691