DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion

Cited by: 5
Authors
Bi, Jiangfeng [1]
Wei, Haiyue [1]
Zhang, Guoxin [1]
Yang, Kuihe [1]
Song, Ziying [2]
Affiliations
[1] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China
[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China
Keywords
cross-attention dynamic fusion; synchronous data augmentation; 3D object detection; CNN
DOI
10.1109/TLA.2024.10412035
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In autonomous driving, LiDAR and camera sensors supply the observational data on which precise 3D object detection depends. Existing fusion algorithms exploit the complementary data from both sensors. However, these methods typically concatenate raw point cloud data with pixel-level image features, a process that introduces errors and loses critical information embedded in each modality. To mitigate this loss of feature information, this paper proposes a Cross-Attention Dynamic Fusion (CADF) strategy that dynamically fuses the two heterogeneous data sources. In addition, data augmentation for these two modalities is often insufficient. To address this, we propose a Synchronous Data Augmentation (SDA) strategy designed to enhance training efficiency. We evaluated our method on the KITTI and nuScenes datasets. Our top-performing model attained 82.52% mAP on the KITTI test benchmark, outperforming other state-of-the-art methods.
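The abstract names cross-attention as the fusion mechanism but this record gives no implementation details. As a rough illustration of the general idea only, the sketch below lets each LiDAR feature vector act as a query over a set of image feature vectors and adds the attended image context back to the LiDAR feature. The function names, the absence of learned projection matrices, and the residual addition are all assumptions made for this example; this is not the paper's CADF module.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(lidar_feats, image_feats):
    """Hypothetical helper: plain scaled dot-product cross-attention.

    lidar_feats: list of query vectors (one per point or voxel).
    image_feats: list of key/value vectors (one per pixel or patch).
    Assumption: both modalities already share the same feature width,
    and no learned Q/K/V projections are applied.
    """
    dim = len(lidar_feats[0])
    fused = []
    for query in lidar_feats:
        # Similarity of this LiDAR feature to every image feature.
        scores = [dot(query, key) / math.sqrt(dim) for key in image_feats]
        weights = softmax(scores)
        # Weighted sum of image features (values equal keys here).
        attended = [
            sum(w * value[j] for w, value in zip(weights, image_feats))
            for j in range(dim)
        ]
        # Residual addition keeps the original LiDAR information.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused
```

Because the attention weights are recomputed for every query, the contribution of each image feature varies per LiDAR feature rather than being fixed as in plain concatenation, which is the sense of "dynamic" this sketch assumes.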
Pages: 106-112
Page count: 7