DyFusion: Cross-Attention 3D Object Detection with Dynamic Fusion

Cited by: 5

Authors:

Bi, Jiangfeng [1]

Wei, Haiyue [1]

Zhang, Guoxin [1]

Yang, Kuihe [1]

Song, Ziying [2]

Affiliations:

[1] Hebei Univ Sci & Technol, Sch Informat Sci & Engn, Shijiazhuang 050018, Peoples R China

[2] Beijing Jiaotong Univ, Sch Comp & Informat Technol, Beijing Key Lab Traff Data Anal & Min, Beijing, Peoples R China

Source:

IEEE LATIN AMERICA TRANSACTIONS | 2024, Vol. 22, No. 02

Keywords:

cross-attention dynamic fusion; synchronous data augmentation; 3D object detection; CNN

DOI:

10.1109/TLA.2024.10412035

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

In autonomous driving, LiDAR and camera sensors play an indispensable role, providing the observational data needed for precise 3D object detection. Existing fusion algorithms make use of the complementary data from both sensors. However, these methods typically concatenate the raw point cloud data with pixel-level image features, a process that introduces errors and loses critical information embedded in each modality. To mitigate this loss of feature information, this paper proposes a Cross-Attention Dynamic Fusion (CADF) strategy that dynamically fuses the two heterogeneous data sources. We also observe that data augmentation for these two modalities is insufficient, and to address this we propose a Synchronous Data Augmentation (SDA) strategy designed to enhance training efficiency. We tested our method on the KITTI and nuScenes datasets. Our top-performing model attained 82.52% mAP on the KITTI test benchmark, outperforming other state-of-the-art methods.
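
As an illustration of the general idea behind cross-attention fusion, the following is a minimal sketch in plain Python. It is not the paper's CADF module: the abstract does not give the architecture, so the function name `cross_attention_fuse`, the absence of learned projections, and the sigmoid gate standing in for the "dynamic" weighting are all assumptions made for this example. LiDAR features act as queries, image features act as keys and values, and each LiDAR feature is combined with an attention-weighted image context instead of being concatenated with it.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention_fuse(lidar_feats, image_feats):
    """Fuse each LiDAR feature with an attention-weighted image context.

    lidar_feats: list of N vectors of length d (used as queries)
    image_feats: list of M vectors of length d (used as keys and values)
    Returns a list of N fused vectors of length d.

    Hypothetical sketch: no learned projections, and the gate below is an
    assumed stand-in for a dynamic fusion weight, not the paper's design.
    """
    d = len(lidar_feats[0])
    scale = math.sqrt(d)
    fused = []
    for query in lidar_feats:
        # Scaled dot-product scores between this query and every image feature.
        scores = [dot(query, key) / scale for key in image_feats]
        weights = softmax(scores)
        # Attention-weighted sum of the image features (the context vector).
        context = [
            sum(w * value[j] for w, value in zip(weights, image_feats))
            for j in range(d)
        ]
        # Assumed gate in (0, 1): how much image context to mix in.
        gate = 1.0 / (1.0 + math.exp(-dot(query, context) / scale))
        fused.append([q + gate * c for q, c in zip(query, context)])
    return fused


if __name__ == "__main__":
    lidar = [[1.0, 0.0]]
    image = [[1.0, 0.0], [0.0, 1.0]]
    print(cross_attention_fuse(lidar, image))
```

In this sketch the image feature most similar to the LiDAR query receives the larger attention weight, so the fused vector leans toward it. A real detector would apply learned query, key, and value projections and operate on batched tensors.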

Pages: 106-112

Page count: 7
