Exploring Contextual Representation and Multi-modality for End-to-end Autonomous Driving

Cited by: 1
Authors
Azam, Shoaib [1 ,2 ]
Munir, Farzeen [1 ,2 ]
Kyrki, Ville [1 ,2 ]
Kucner, Tomasz Piotr [1 ,2 ]
Jeon, Moongu [3 ]
Pedrycz, Witold [4 ,5 ,6 ]
Affiliations
[1] Aalto Univ, Dept Elect Engn & Automat, Espoo, Finland
[2] Finnish Ctr Artificial Intelligence, Espoo, Finland
[3] Gwangju Inst Sci & Technol, Sch Elect Engn & Comp Sci, Gwangju 61005, South Korea
[4] Univ Alberta, Dept Elect & Comp Engn, Edmonton, AB T6R 2V4, Canada
[5] King Abdulaziz Univ, Fac Engn, Dept Elect & Comp Engn, Jeddah 21589, Saudi Arabia
[6] Polish Acad Sci, Syst Res Inst, PL-01447 Warsaw, Poland
Funding
Academy of Finland
Keywords
Vision-centric autonomous driving; Attention; Contextual representation; Imitation learning; Vision transformer
DOI
10.1016/j.engappai.2024.108767
CLC Number
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
Learning contextual and spatial environmental representations enhances an autonomous vehicle's hazard anticipation and decision-making in complex scenarios. Recent perception systems enhance spatial understanding with sensor fusion but often lack global environmental context. Humans, when driving, naturally employ neural maps that integrate factors such as historical data, situational subtleties, and behavioral predictions of other road users to form a rich contextual understanding of their surroundings. This neural-map-based comprehension is integral to making informed decisions on the road. In contrast, even with their significant advancements, autonomous systems have yet to fully harness this depth of human-like contextual understanding. Motivated by this, our work draws inspiration from human driving patterns and seeks to formalize the sensor fusion approach within an end-to-end autonomous driving framework. We introduce a framework that integrates three cameras (left, right, and center) to emulate the human field of view, coupled with top-down bird's-eye-view semantic data to enhance contextual representation. The sensor data is fused and encoded using a self-attention mechanism, leading to an auto-regressive waypoint prediction module. We treat feature representation as a sequential problem, employing a vision transformer to distill the contextual interplay between sensor modalities. The efficacy of the proposed method is experimentally evaluated in both open- and closed-loop settings. Our method achieves a displacement error of 0.67 m in the open-loop setting, surpassing current methods by 6.9% on the nuScenes dataset. In closed-loop evaluations on CARLA's Town05 Long and Longest6 benchmarks, the proposed method improves driving performance and route completion, and reduces infractions.
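The abstract describes the pipeline at a high level: multi-view camera and BEV-semantic features are turned into token sequences, fused with self-attention, and decoded into waypoints auto-regressively. The sketch below illustrates that flow in PyTorch; it is a minimal illustration only, not the authors' implementation. All dimensions, module choices (a generic transformer encoder standing in for the vision transformer fusion, a GRU cell for the waypoint decoder), and names such as FusionWaypointSketch are assumptions, since the abstract gives no architectural specifics.

import torch
import torch.nn as nn

class FusionWaypointSketch(nn.Module):
    """Illustrative only: self-attention fusion of camera and BEV tokens,
    followed by an auto-regressive GRU waypoint decoder."""

    def __init__(self, d_model=256, n_waypoints=4):
        super().__init__()
        # Project per-modality features into a shared token space
        # (the 512/128 input widths are assumed placeholder backbone dims).
        self.cam_proj = nn.Linear(512, d_model)
        self.bev_proj = nn.Linear(128, d_model)
        # Self-attention over the concatenated multi-modal token sequence.
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=4)
        # Auto-regressive decoder: each step consumes the previous waypoint.
        self.gru = nn.GRUCell(input_size=2, hidden_size=d_model)
        self.head = nn.Linear(d_model, 2)
        self.n_waypoints = n_waypoints

    def forward(self, cam_tokens, bev_tokens):
        # cam_tokens: (B, N_cam, 512) from three views; bev_tokens: (B, N_bev, 128)
        tokens = torch.cat([self.cam_proj(cam_tokens), self.bev_proj(bev_tokens)], dim=1)
        fused = self.fusion(tokens)              # (B, N_cam + N_bev, d_model)
        h = fused.mean(dim=1)                    # pooled context seeds the GRU state
        wp = torch.zeros(tokens.size(0), 2, device=tokens.device)
        out = []
        for _ in range(self.n_waypoints):
            h = self.gru(wp, h)                  # condition on the previous waypoint
            wp = wp + self.head(h)               # predict a displacement, accumulate
            out.append(wp)
        return torch.stack(out, dim=1)           # (B, n_waypoints, 2) BEV coordinates

# Example: three views x 64 tokens each, 256 BEV tokens, batch of 2.
model = FusionWaypointSketch()
waypoints = model(torch.randn(2, 192, 512), torch.randn(2, 256, 128))

The auto-regressive loop mirrors the abstract's "auto-regressive waypoint prediction module": each predicted waypoint feeds the next decoding step, a common pattern in end-to-end driving models evaluated on nuScenes and CARLA.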
Pages: 13
Related Papers
50 records in total
  • [1] Multi-modality end-to-end audit by the ACDS
    Lye, J.
    Gibbons, F.
    Shaw, M.
    Alves, A.
    Keehan, S.
    Williams, I.
    [J]. RADIOTHERAPY AND ONCOLOGY, 2017, 123 : S966 - S967
  • [2] End-to-end Autonomous Driving Perception with Sequential Latent Representation Learning
    Chen, Jianyu
    Xu, Zhuo
    Tomizuka, Masayoshi
    [J]. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 1999 - 2006
  • [3] End-to-end testing for stereotactic radiotherapy including the development of a MULTI-MODALITY phantom
    Shariff, Maya
    Grigo, Johanna
    Masitho, Siti
    Brandt, Tobias
    Weiss, Alexander
    Lambrecht, Ulrike
    Stillkrieg, Willi
    Lotter, Michael
    Putz, Florian
    Fietkau, Rainer
    Bert, Christoph
[J]. ZEITSCHRIFT FÜR MEDIZINISCHE PHYSIK, 2024, 34 (03): 477 - 484
  • [4] Multimodal End-to-End Autonomous Driving
    Xiao, Yi
    Codevilla, Felipe
    Gurram, Akhil
    Urfalioglu, Onay
    Lopez, Antonio M.
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2022, 23 (01) : 537 - 547
  • [5] Adversarial Driving: Attacking End-to-End Autonomous Driving
    Wu, Han
    Yunas, Syed
    Rowlands, Sareh
    Ruan, Wenjie
    Wahlstrom, Johan
[J]. 2023 IEEE INTELLIGENT VEHICLES SYMPOSIUM, IV, 2023
  • [6] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
    Prakash, Aditya
    Chitta, Kashyap
    Geiger, Andreas
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 7073 - 7083
  • [7] Multi-task Learning with Attention for End-to-end Autonomous Driving
    Ishihara, Keishi
    Kanervisto, Anssi
    Miura, Jun
    Hautamaki, Ville
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021, : 2896 - 2905
  • [8] Multi-modal policy fusion for end-to-end autonomous driving
    Huang, Zhenbo
    Sun, Shiliang
    Zhao, Jing
    Mao, Liang
    [J]. INFORMATION FUSION, 2023, 98
  • [9] RobustE2E: Exploring the Robustness of End-to-End Autonomous Driving
    Jiang, Wei
    Wang, Lu
    Zhang, Tianyuan
    Chen, Yuwei
    Dong, Jian
    Bao, Wei
    Zhang, Zichao
    Fu, Qiang
    [J]. ELECTRONICS, 2024, 13 (16)
  • [10] End-to-End Urban Autonomous Driving With Safety Constraints
    Hou, Changmeng
    Zhang, Wei
    [J]. IEEE ACCESS, 2024, 12 : 132198 - 132209