Multi-modal policy fusion for end-to-end autonomous driving

Cited by: 15
Authors
Huang, Zhenbo [1]
Sun, Shiliang [1]
Zhao, Jing [1]
Mao, Liang [1]
Affiliations
[1] East China Normal Univ, Sch Comp Sci & Technol, 3663 North Zhongshan Rd, Shanghai 200062, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Multi-modal policy fusion; Autonomous driving; Reinforcement learning; Robust fused policy;
DOI
10.1016/j.inffus.2023.101834
CLC number
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
Multi-modal learning has made impressive progress in autonomous driving by leveraging information from multiple sensors. Existing feature fusion methods make decisions by integrating perceptions from different sensors. However, such systems can be risky because the fused features become unreliable when one of the sensors fails. Moreover, these methods require either sophisticated geometric designs to align features or complex neural networks to fuse them effectively, significantly increasing the training cost. In this paper, we propose PolicyFuser, a policy fusion method for end-to-end autonomous driving that addresses these issues. PolicyFuser retains an independent decision for each sensor, so no feature alignment or complex neural networks are required. To focus on the best policy, we use reinforcement learning to select the action with the highest Q-value as the primary decision and treat the remaining actions as secondary decisions. The secondary decisions then fine-tune the primary decision through a primary and secondary policy fusion (PSF) module. To bridge the gap between the decisions from different sensors and improve the stability of policy fusion, we use a conditional variational autoencoder (CVAE) to generate pseudo-expert decisions. We demonstrate the effectiveness of our method in CARLA, where it achieves the highest driving scores and handles sensor failures gracefully.
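The primary/secondary selection described in the abstract can be sketched in a few lines: each sensor's policy proposes an action with a Q-value, the highest-Q action becomes the primary decision, and the remaining actions nudge it slightly. This is a minimal illustrative sketch only; the function name `fuse_policies`, the step size `alpha`, and the softmax weighting of secondary actions are assumptions, not the paper's actual PSF module.

```python
import numpy as np

def fuse_policies(actions, q_values, alpha=0.1):
    """Sketch of primary/secondary policy fusion (PSF).

    actions:  (n_sensors, action_dim) per-sensor action proposals
    q_values: (n_sensors,) estimated Q-value of each proposal
    alpha:    how strongly secondary decisions fine-tune the primary one
    """
    actions = np.asarray(actions, dtype=float)
    q_values = np.asarray(q_values, dtype=float)

    # Primary decision: the per-sensor action with the highest Q-value.
    primary_idx = int(np.argmax(q_values))
    primary = actions[primary_idx]

    # Secondary decisions: all remaining per-sensor actions.
    mask = np.ones(len(actions), dtype=bool)
    mask[primary_idx] = False
    if not mask.any():
        return primary  # single sensor: nothing to fuse

    # Weight secondary actions by a softmax over their Q-values.
    w = np.exp(q_values[mask] - q_values[mask].max())
    w /= w.sum()
    secondary = (w[:, None] * actions[mask]).sum(axis=0)

    # Fine-tune the primary decision with a small step toward
    # the secondary consensus.
    return (1 - alpha) * primary + alpha * secondary
```

Because each sensor keeps its own policy, a failed sensor only loses its vote; the fused decision degrades gracefully instead of collapsing with a corrupted joint feature.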
Pages: 11
Related papers
50 records in total
  • [1] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
    Prakash, Aditya
    Chitta, Kashyap
    Geiger, Andreas
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 7073-7083
  • [2] CrossFuser: Multi-Modal Feature Fusion for End-to-End Autonomous Driving Under Unseen Weather Conditions
    Wu, Weishang
    Deng, Xiaoheng
    Jiang, Ping
    Wan, Shaohua
    Guo, Yuanxiong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (12): 14378-14392
  • [3] Multi-Modal Sensor Fusion-Based Deep Neural Network for End-to-End Autonomous Driving With Scene Understanding
    Huang, Zhiyu
    Lv, Chen
    Xing, Yang
    Wu, Jingda
    IEEE SENSORS JOURNAL, 2021, 21 (10): 11781-11790
  • [4] Multi-Modal Fusion for End-to-End RGB-T Tracking
    Zhang, Lichao
    Danelljan, Martin
    Gonzalez-Garcia, Abel
    van de Weijer, Joost
    Khan, Fahad Shahbaz
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 2252-2261
  • [5] MMFN: Multi-Modal-Fusion-Net for End-to-End Driving
    Zhang, Qingwen
    Tang, Mingkai
    Geng, Ruoyu
    Chen, Feiyi
    Xin, Ren
    Wang, Lujia
    2022 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2022: 8638-8643
  • [6] SymmetricNet: end-to-end mesoscale eddy detection with multi-modal data fusion
    Zhao, Yuxiao
    Fan, Zhenlin
    Li, Haitao
    Zhang, Rui
    Xiang, Wei
    Wang, Shengke
    Zhong, Guoqiang
    FRONTIERS IN MARINE SCIENCE, 2023, 10
  • [7] Multi-Modal Data Augmentation for End-to-End ASR
    Renduchintala, Adithya
    Ding, Shuoyang
    Wiesner, Matthew
    Watanabe, Shinji
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018: 2394-2398
  • [8] End-to-end Knowledge Retrieval with Multi-modal Queries
    Luo, Man
    Fang, Zhiyuan
    Gokhale, Tejas
    Yang, Yezhou
    Baral, Chitta
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023: 8573-8589
  • [9] End-to-end Multi-modal Video Temporal Grounding
    Chen, Yi-Wen
    Tsai, Yi-Hsuan
    Yang, Ming-Hsuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [10] End-to-End Compound Table Understanding with Multi-Modal Modeling
    Li, Zaisheng
    Li, Yi
    Liang, Qiao
    Li, Pengfei
    Cheng, Zhanzhan
    Niu, Yi
    Pu, Shiliang
    Li, Xi
    PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022: 4112-4121