Risk-Sensitive Mobile Robot Navigation in Crowded Environment via Offline Reinforcement Learning

Cited by: 0
Authors
Wu, Jiaxu [1 ]
Wang, Yusheng [1 ]
Asama, Hajime [1 ]
An, Qi [2 ]
Yamashita, Atsushi [2 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Precis Engn, 7-3-1 Hongo,Bunkyo Ku, Tokyo 1138656, Japan
[2] Univ Tokyo, Grad Sch Frontier Sci, Dept Human & Engn Environm Studies, 5-1-5 Kashiwanoha, Kashiwa, Chiba 2778563, Japan
Keywords
Collision avoidance
DOI
10.1109/IROS55552.2023.10341948
CLC number
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Mobile robot navigation in human-populated environments, commonly referred to as crowd navigation, has attracted great interest from the research community in recent years. Offline reinforcement learning (RL) has recently been introduced to this domain because it alleviates the sim-to-real gap inherent in online RL, which relies on simulators for training, and because the same dataset can be reused to train policies for differently customized rewards. However, the learned navigation policy suffers from the distributional shift between the training data and the inputs encountered during deployment: given an input outside the training distribution, the policy risks choosing an erroneous action that leads to catastrophic failure, such as colliding with a human. To make the offline RL agent risk-sensitive and improve its safety during deployment, this work proposes a multi-policy control framework that combines an offline RL navigation policy with a risk detector and a force-based risk-avoiding policy. In particular, a Lyapunov density model is learned on the latent features of the offline RL policy and serves as the risk detector, switching control to the risk-avoiding policy whenever the robot tends to leave the region supported by the training data. Experimental results show that the proposed method learns crowd navigation from an offline trajectory dataset, and that the risk detector substantially reduces the collision rate of the vanilla offline RL agent while maintaining navigation efficiency, outperforming state-of-the-art methods.
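The switching logic described in the abstract can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: the Mahalanobis-distance risk score stands in for the learned Lyapunov density model, the artificial-potential-field fallback stands in for the force-based risk-avoiding policy, and all function names, parameters (`threshold`, `k_rep`, `k_att`), and thresholds are assumptions for illustration only.

```python
import numpy as np

def density_score(latent, data_mean, data_cov_inv):
    """Stand-in risk score: Mahalanobis distance of the policy's latent
    feature from the training-data distribution (higher = riskier).
    The paper instead uses a learned Lyapunov density model."""
    diff = latent - data_mean
    return float(diff @ data_cov_inv @ diff)

def force_based_action(robot_pos, human_positions, goal_pos,
                       k_rep=1.0, k_att=0.5):
    """Simple attractive/repulsive force fallback (potential-field style),
    standing in for the paper's force-based risk-avoiding policy."""
    force = k_att * (goal_pos - robot_pos)          # pull toward the goal
    for h in human_positions:
        diff = robot_pos - h
        dist = np.linalg.norm(diff) + 1e-6
        force += k_rep * diff / dist**3             # push away from humans
    return force / (np.linalg.norm(force) + 1e-6)  # unit-norm direction

def select_action(rl_action, latent, data_mean, data_cov_inv, threshold,
                  robot_pos, human_positions, goal_pos):
    """Multi-policy switch: hand control to the risk-avoiding policy when
    the risk detector flags the state as out-of-distribution."""
    if density_score(latent, data_mean, data_cov_inv) > threshold:
        return force_based_action(robot_pos, human_positions, goal_pos)
    return rl_action
```

In-distribution states pass the offline RL action through unchanged; out-of-distribution states trigger the force-based fallback, which mirrors the framework's goal of steering the robot back into the data-supported region rather than trusting an extrapolated RL action.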
Pages: 7456-7462 (7 pages)