Cross-Observability Optimistic-Pessimistic Safe Reinforcement Learning for Interactive Motion Planning With Visual Occlusion

Cited by: 1
Authors
Hou, Xiaohui [1 ]
Gan, Minggang [1 ]
Wu, Wei [1 ]
Ji, Yuan [2 ]
Zhao, Shiyue [3 ]
Chen, Jie [4 ]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100081, Peoples R China
[2] Nanyang Technol Univ, Sch Mech & Aerosp Engn, Singapore 639798, Singapore
[3] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[4] Tongji Univ, Natl Key Lab Autonomous Intelligent Unmanned Syst, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Planning; Vehicle dynamics; Safety; Visualization; Autonomous vehicles; Uncertainty; Reinforcement learning; Motion planning; autonomous vehicles; reinforcement learning; risk evaluation; visual occlusion; PREDICTION;
DOI
10.1109/TITS.2024.3443397
CLC Classification Number
TU [Building Science];
Subject Classification Code
0813;
Abstract
This study focuses on the motion planning and risk evaluation of unprotected left turns at occluded intersections for autonomous vehicles. We present an interactive motion planning controller that combines Cross-Observability Optimistic-Pessimistic Safe Reinforcement Learning (COOP-SRL) with Nonlinear Model Predictive Control (NMPC), accounting for the uncertain potential risk of the occluded zone, the trade-off between safety and efficiency, and the dynamic interaction between vehicles. The proposed COOP-SRL algorithm integrates fully and partially observable policies through cross-observability soft imitation learning to leverage expert guidance and improve learning efficiency. Moreover, an optimistic exploration policy and a pessimistic safety constraint are adopted to provide an adaptive safe strategy without hindering exploration during the learning process. Finally, the proposed controller was evaluated in occluded intersection scenarios with various traffic density levels; the results indicate that it outperforms both optimization-based and learning-based baselines in qualitative and quantitative terms.
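As a rough illustration of the optimistic-pessimistic idea summarized above, the sketch below (not drawn from the paper; the toy bandit environment, the confidence bonus, the cost budget, and the dual-ascent step are all assumptions) pairs an optimistic reward estimate used for exploration with a pessimistic cost estimate used in a Lagrangian safety term, so that exploration is encouraged unless the pessimistic safety estimate suggests the budget may be violated.

    # Minimal sketch (not the authors' implementation) of an optimistic-pessimistic
    # safe learning loop: exploration scores use an upper-bound (optimistic) reward
    # estimate, while the safety constraint uses an upper-bound (pessimistic) cost
    # estimate, combined through a Lagrange multiplier. Everything here is a toy
    # stand-in for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)

    n_actions = 4
    counts = np.zeros(n_actions)       # pulls per action
    reward_mean = np.zeros(n_actions)  # running reward estimate
    cost_mean = np.zeros(n_actions)    # running safety-cost estimate
    cost_limit = 0.3                   # assumed safety budget
    lam = 0.0                          # Lagrange multiplier for the constraint
    lam_lr = 0.05

    def confidence(pulls):
        # Width of a simple confidence interval; shrinks as counts grow.
        return np.sqrt(2.0 * np.log(1.0 + pulls.sum()) / np.maximum(pulls, 1.0))

    for t in range(500):
        bonus = confidence(counts)
        optimistic_reward = reward_mean + bonus   # optimistic exploration estimate
        pessimistic_cost = cost_mean + bonus      # pessimistic safety estimate
        # Lagrangian score: favor high-reward actions unless their pessimistic
        # cost suggests the safety budget may be exceeded.
        scores = optimistic_reward - lam * pessimistic_cost
        a = int(np.argmax(scores))

        # Toy environment feedback (illustrative only).
        true_reward = np.array([0.2, 0.5, 0.8, 1.0])
        true_cost = np.array([0.0, 0.1, 0.4, 0.7])
        r = true_reward[a] + 0.1 * rng.standard_normal()
        c = np.clip(true_cost[a] + 0.1 * rng.standard_normal(), 0.0, 1.0)

        counts[a] += 1
        reward_mean[a] += (r - reward_mean[a]) / counts[a]
        cost_mean[a] += (c - cost_mean[a]) / counts[a]
        # Dual ascent: tighten the multiplier when the pessimistic cost of the
        # chosen action exceeds the budget, relax it otherwise.
        lam = max(0.0, lam + lam_lr * (pessimistic_cost[a] - cost_limit))

    print("action pulls:", counts, "final multiplier:", round(lam, 3))

In the paper this trade-off is embedded in a policy-learning setting with NMPC tracking and cross-observability soft imitation from a fully observable expert; the sketch isolates only the optimistic-exploration versus pessimistic-constraint mechanism.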
Pages: 17602 - 17613
Number of Pages: 12
Related Papers
14 records
  • [1] Multi-Agent Reinforcement Learning Algorithm with Variable Optimistic-Pessimistic Criterion
    Akchurina, Natalia
    ECAI 2008, PROCEEDINGS, 2008, 178 : 433+
  • [2] DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning
    Bura, Archana
    HasanzadeZonuzy, Aria
    Kalathil, Dileep
    Shakkottai, Srinivas
    Chamberland, Jean-Francois
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [3] Risk assessment and interactive motion planning with visual occlusion using graph attention networks and reinforcement learning
    Hou, Xiaohui
    Gan, Minggang
    Wu, Wei
    Zhao, Tiantong
    Chen, Jie
    ADVANCED ENGINEERING INFORMATICS, 2024, 62
  • [4] Optimistic Reinforcement Learning-Based Skill Insertions for Task and Motion Planning
    Liu, Gaoyuan
    de Winter, Joris
    Durodie, Yuri
    Steckelmacher, Denis
    Nowe, Ann
    Vanderborght, Bram
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (06) : 5974 - 5981
  • [5] Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles
    Zhang, Lixian
    Zhang, Ruixian
    Wu, Tong
    Weng, Rui
    Han, Minghao
    Zhao, Ye
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (12) : 5435 - 5444
  • [6] Distributed safe reinforcement learning for multi-robot motion planning
    Lu, Yang
    Guo, Yaohua
    Zhao, Guoxiang
    Zhu, Minghui
    2021 29TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2021, : 1209 - 1214
  • [7] Merging planning in dense traffic scenarios using interactive safe reinforcement learning
    Hou, Xiaohui
    Gan, Minggang
    Wu, Wei
    Wang, Chenyu
    Ji, Yuan
    Zhao, Shiyue
    KNOWLEDGE-BASED SYSTEMS, 2024, 290
  • [8] Safe multi-agent motion planning via filtered reinforcement learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 7270 - 7276
  • [9] Safe Multiagent Motion Planning Under Uncertainty for Drones Using Filtered Reinforcement Learning
    Safaoui, Sleiman
    Vinod, Abraham P.
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    IEEE TRANSACTIONS ON ROBOTICS, 2024, 40 : 2529 - 2542
  • [10] Decentralized, Safe, Multiagent Motion Planning for Drones Under Uncertainty via Filtered Reinforcement Learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Summers, Tyler H.
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2024, 32 (06) : 2492 - 2499