Cross-Observability Optimistic-Pessimistic Safe Reinforcement Learning for Interactive Motion Planning With Visual Occlusion

Cited by: 1
Authors
Hou, Xiaohui [1 ]
Gan, Minggang [1 ]
Wu, Wei [1 ]
Ji, Yuan [2 ]
Zhao, Shiyue [3 ]
Chen, Jie [4 ]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Natl Key Lab Autonomous Intelligent Unmanned Syst, Beijing 100081, Peoples R China
[2] Nanyang Technol Univ, Sch Mech & Aerosp Engn, Singapore 639798, Singapore
[3] Tsinghua Univ, Sch Vehicle & Mobil, Beijing 100084, Peoples R China
[4] Tongji Univ, Natl Key Lab Autonomous Intelligent Unmanned Syst, Shanghai 201804, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Planning; Vehicle dynamics; Safety; Visualization; Autonomous vehicles; Uncertainty; Reinforcement learning; Motion planning; autonomous vehicles; reinforcement learning; risk evaluation; visual occlusion; PREDICTION;
DOI
10.1109/TITS.2024.3443397
CLC Classification Number
TU [Building Science];
Subject Classification Code
0813;
Abstract
This study focuses on the motion planning and risk evaluation of unprotected left turns at occluded intersections for autonomous vehicles. We present an interactive motion planning controller that combines Cross-Observability Optimistic-Pessimistic Safe Reinforcement Learning (COOP-SRL) with Nonlinear Model Predictive Control (NMPC), accounting for the uncertain potential risk of the occluded zone, the trade-off between safety and efficiency, and the dynamic interaction between vehicles. The proposed COOP-SRL algorithm integrates fully and partially observable policies through cross-observability soft imitation learning to leverage expert guidance and improve learning efficiency. Moreover, an optimistic exploration policy and a pessimistic safety constraint are adopted to provide an adaptive safe strategy without hindering exploration during the learning process. Finally, the proposed controller was evaluated in occluded intersection scenarios with various traffic density levels; the results indicate that it outperforms both optimization-based and learning-based baselines in qualitative and quantitative terms.
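As a rough illustration of the optimistic-pessimistic idea summarized above, the sketch below (not drawn from the paper; the toy bandit environment, the confidence bonus, the cost budget, and the dual-ascent step are all assumptions) pairs an optimistic reward estimate used for exploration with a pessimistic cost estimate used in a Lagrangian safety term, so that exploration is encouraged unless the pessimistic safety estimate suggests the budget may be violated.

    # Minimal sketch (not the authors' implementation) of an optimistic-pessimistic
    # safe learning loop: exploration scores use an upper-bound (optimistic) reward
    # estimate, while the safety constraint uses an upper-bound (pessimistic) cost
    # estimate, combined through a Lagrange multiplier. Everything here is a toy
    # stand-in for illustration only.
    import numpy as np

    rng = np.random.default_rng(0)

    n_actions = 4
    counts = np.zeros(n_actions)       # pulls per action
    reward_mean = np.zeros(n_actions)  # running reward estimate
    cost_mean = np.zeros(n_actions)    # running safety-cost estimate
    cost_limit = 0.3                   # assumed safety budget
    lam = 0.0                          # Lagrange multiplier for the constraint
    lam_lr = 0.05

    def confidence(pulls):
        # Width of a simple confidence interval; shrinks as counts grow.
        return np.sqrt(2.0 * np.log(1.0 + pulls.sum()) / np.maximum(pulls, 1.0))

    for t in range(500):
        bonus = confidence(counts)
        optimistic_reward = reward_mean + bonus   # optimistic exploration estimate
        pessimistic_cost = cost_mean + bonus      # pessimistic safety estimate
        # Lagrangian score: favor high-reward actions unless their pessimistic
        # cost suggests the safety budget may be exceeded.
        scores = optimistic_reward - lam * pessimistic_cost
        a = int(np.argmax(scores))

        # Toy environment feedback (illustrative only).
        true_reward = np.array([0.2, 0.5, 0.8, 1.0])
        true_cost = np.array([0.0, 0.1, 0.4, 0.7])
        r = true_reward[a] + 0.1 * rng.standard_normal()
        c = np.clip(true_cost[a] + 0.1 * rng.standard_normal(), 0.0, 1.0)

        counts[a] += 1
        reward_mean[a] += (r - reward_mean[a]) / counts[a]
        cost_mean[a] += (c - cost_mean[a]) / counts[a]
        # Dual ascent: tighten the multiplier when the pessimistic cost of the
        # chosen action exceeds the budget, relax it otherwise.
        lam = max(0.0, lam + lam_lr * (pessimistic_cost[a] - cost_limit))

    print("action pulls:", counts, "final multiplier:", round(lam, 3))

In the paper this trade-off is embedded in a policy-learning setting with NMPC tracking and cross-observability soft imitation from a fully observable expert; the sketch isolates only the optimistic-exploration versus pessimistic-constraint mechanism.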
Pages: 17602 - 17613
Number of Pages: 12
Related Papers
14 records
  • [1] Multi-Agent Reinforcement Learning Algorithm with Variable Optimistic-Pessimistic Criterion
    Akchurina, Natalia
    ECAI 2008, PROCEEDINGS, 2008, 178 : 433+
  • [2] DOPE: Doubly Optimistic and Pessimistic Exploration for Safe Reinforcement Learning
    Bura, Archana
    HasanzadeZonuzy, Aria
    Kalathil, Dileep
    Shakkottai, Srinivas
    Chamberland, Jean-Francois
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [3] Risk assessment and interactive motion planning with visual occlusion using graph attention networks and reinforcement learning
    Hou, Xiaohui
    Gan, Minggang
    Wu, Wei
    Zhao, Tiantong
    Chen, Jie
    ADVANCED ENGINEERING INFORMATICS, 2024, 62
  • [4] Optimistic Reinforcement Learning-Based Skill Insertions for Task and Motion Planning
    Liu, Gaoyuan
    de Winter, Joris
    Durodie, Yuri
    Steckelmacher, Denis
    Nowe, Ann
    Vanderborght, Bram
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (06) : 5974 - 5981
  • [5] Safe Reinforcement Learning With Stability Guarantee for Motion Planning of Autonomous Vehicles
    Zhang, Lixian
    Zhang, Ruixian
    Wu, Tong
    Weng, Rui
    Han, Minghao
    Zhao, Ye
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2021, 32 (12) : 5435 - 5444
  • [6] Distributed safe reinforcement learning for multi-robot motion planning
    Lu, Yang
    Guo, Yaohua
    Zhao, Guoxiang
    Zhu, Minghui
    2021 29TH MEDITERRANEAN CONFERENCE ON CONTROL AND AUTOMATION (MED), 2021, : 1209 - 1214
  • [7] Merging planning in dense traffic scenarios using interactive safe reinforcement learning
    Hou, Xiaohui
    Gan, Minggang
    Wu, Wei
    Wang, Chenyu
    Ji, Yuan
    Zhao, Shiyue
    KNOWLEDGE-BASED SYSTEMS, 2024, 290
  • [8] Safe multi-agent motion planning via filtered reinforcement learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA 2022, 2022, : 7270 - 7276
  • [9] Safe Multiagent Motion Planning Under Uncertainty for Drones Using Filtered Reinforcement Learning
    Safaoui, Sleiman
    Vinod, Abraham P.
    Chakrabarty, Ankush
    Quirynen, Rien
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    IEEE TRANSACTIONS ON ROBOTICS, 2024, 40 : 2529 - 2542
  • [10] Decentralized, Safe, Multiagent Motion Planning for Drones Under Uncertainty via Filtered Reinforcement Learning
    Vinod, Abraham P.
    Safaoui, Sleiman
    Summers, Tyler H.
    Yoshikawa, Nobuyuki
    Di Cairano, Stefano
    IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY, 2024, 32 (06) : 2492 - 2499