DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion

Cited by: 5

Authors:

Bi, Jiangfeng [1]

Wei, Haiyue [1]

Zhang, Guoxin [1]

Yang, Kuihe [1]

Song, Ziying [2]

Affiliations:

[1] Hebei University of Science and Technology, School of Information Science and Engineering, Shijiazhuang 050018, China

[2] Beijing Jiaotong University, School of Computer and Information Technology, Beijing Key Laboratory of Traffic Data Analysis and Mining, Beijing, China

Source:

IEEE Latin America Transactions | 2024 / Vol. 22 / No. 2

Keywords:

cross-attention dynamic fusion; synchronous data augmentation; 3D object detection; CNN

DOI:

10.1109/TLA.2024.10412035

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In autonomous driving, LiDAR and camera sensors supply the observational data needed for precise 3D object detection. Existing fusion algorithms exploit the complementary data from both sensors. However, these methods typically concatenate raw point cloud data with pixel-level image features, a process that introduces errors and loses critical information embedded in each modality. To mitigate this loss of feature information, this paper proposes a Cross-Attention Dynamic Fusion (CADF) strategy that dynamically fuses the two heterogeneous data sources. To address insufficient data augmentation across the two modalities, we also propose a Synchronous Data Augmentation (SDA) strategy designed to enhance training efficiency. We evaluated our method on the KITTI and nuScenes datasets. Our top-performing model attained 82.52% mAP on the KITTI test benchmark, outperforming other state-of-the-art methods.
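The abstract does not spell out how CADF is built, so the following is only a generic illustration of the idea it names: LiDAR features act as queries over image features through scaled dot-product cross-attention, and a learned sigmoid gate then mixes the two streams per channel. The gate, the function name `cross_attention_fuse`, the weight matrices, and all dimensions are assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(v - peak) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def sigmoid(x):
    """Logistic function written with tanh so large inputs cannot overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def cross_attention_fuse(lidar, image, w_q, w_k, w_v, w_gate):
    """Hypothetical fusion step (not the paper's exact CADF module).

    lidar: N x C LiDAR features (queries); image: M x C image features
    (keys and values). Returns fused N x C features and the N x M
    attention weights.
    """
    q = matmul(lidar, w_q)                      # N x D
    k = matmul(image, w_k)                      # M x D
    v = matmul(image, w_v)                      # M x C
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]        # D x M
    scores = [[s / scale for s in row] for row in matmul(q, k_t)]
    attn = softmax_rows(scores)                 # each row sums to 1
    attended = matmul(attn, v)                  # N x C image context per LiDAR feature
    # Assumed "dynamic" part: a gate computed from both streams decides,
    # per channel, how much of each modality to keep.
    concat = [l_row + a_row for l_row, a_row in zip(lidar, attended)]  # N x 2C
    gate = [[sigmoid(g) for g in row] for row in matmul(concat, w_gate)]
    fused = [
        [g * l + (1.0 - g) * a for g, l, a in zip(g_row, l_row, a_row)]
        for g_row, l_row, a_row in zip(gate, lidar, attended)
    ]
    return fused, attn


def random_matrix(rng, rows, cols):
    """Random Gaussian weights, standing in for learned parameters."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
N, M, C, D = 5, 7, 8, 4                         # illustrative sizes only
lidar = random_matrix(rng, N, C)
image = random_matrix(rng, M, C)
fused, attn = cross_attention_fuse(
    lidar,
    image,
    random_matrix(rng, C, D),                   # w_q
    random_matrix(rng, C, D),                   # w_k
    random_matrix(rng, C, C),                   # w_v
    random_matrix(rng, 2 * C, C),               # w_gate
)
print(len(fused), len(fused[0]))
```

Compared with plain concatenation, a gated mix of this kind lets each LiDAR feature draw on only the image regions it attends to, which is one plausible reading of "dynamic fusion" in the abstract.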


Pages: 106-112

Page count: 7

