CAF-RCNN: multimodal 3D object detection with cross-attention

Cited by: 0

Authors:

Liu, Junting ^{[1]}

Liu, Deer ^{[1,2]}

Zhu, Lei ^{[1]}

Affiliations:

[1] Jiangxi Univ Sci & Technol, Sch Civil & Surveying & Mapping Engn, Ganzhou, Jiangxi, Peoples R China

[2] Jiangxi Univ Sci & Technol, Sch Civil & Surveying & Mapping Engn, Ganzhou 341400, Jiangxi, Peoples R China

Source:

INTERNATIONAL JOURNAL OF REMOTE SENSING | 2023 / Vol. 44 / Issue 19

Funding:

National Natural Science Foundation of China;

Keywords:

3D object detection; multimodal fusion; cross-attention mechanism; feature pyramid network;

DOI:

10.1080/01431161.2023.2261151

Chinese Library Classification (CLC) number:

TP7 [Remote sensing technology];

Discipline classification codes:

081102 ; 0816 ; 081602 ; 083002 ; 1404 ;

Abstract:

LiDAR and cameras are pivotal sensors for 3D (three-dimensional) object detection. Because their characteristics differ, a growing number of multimodal object detection methods have been proposed. Popular current methods link camera features to LiDAR features through hard association, but because the features are frequently enhanced and aggregated, aligning the two sets of features effectively remains a major challenge. We therefore propose CAF-RCNN. Building on PointRCNN, it uses a Feature Pyramid Network (FPN) to extract high-level semantic features at different scales, then fuses these features with the LiDAR features output by the Set Abstraction (SA) module in PointRCNN and in subsequent steps. For feature fusion, we design a module based on the cross-attention mechanism, the Cross-Attention Fusion Module (CAFM). It combines two channel attention streams in a cross-over fashion to exploit rich details about significant objects in the Image Stream and the Geometric Stream. Extensive experiments on the KITTI dataset show that our method is 6.43% higher than PointRCNN in 3D accuracy.
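The abstract says only that CAFM combines two channel attention streams in a cross-over fashion; it gives no layer details. The following is a minimal NumPy sketch of one plausible reading, not the authors' implementation. It assumes squeeze-and-excitation style channel gates, image features already sampled at the projected point locations, and concatenation as the final merge. The names `channel_attention`, `cross_attention_fuse`, and all shapes and weights are hypothetical.

```python
import numpy as np

def channel_attention(feats, w1, w2):
    """Return per-channel gates in (0, 1) for feats of shape (N, C).

    Assumed design (not from the paper): global average pooling over the
    N points, a ReLU bottleneck, then a sigmoid.
    """
    squeezed = feats.mean(axis=0)                 # (C,) average over points
    hidden = np.maximum(squeezed @ w1, 0.0)       # (C // r,) ReLU bottleneck
    return 1.0 / (1.0 + np.exp(-(hidden @ w2)))   # (C,) sigmoid gate

def cross_attention_fuse(point_feats, image_feats, params):
    """Cross-over gating: each stream is reweighted by the other's gates."""
    gate_from_image = channel_attention(image_feats, *params["img"])
    gate_from_points = channel_attention(point_feats, *params["pts"])
    gated_points = point_feats * gate_from_image   # geometric stream, image-guided
    gated_image = image_feats * gate_from_points   # image stream, geometry-guided
    # Assumed merge step: concatenate along the channel axis.
    return np.concatenate([gated_points, gated_image], axis=1)

# Hypothetical sizes: n points, c channels per stream, reduction ratio r.
rng = np.random.default_rng(0)
n, c, r = 128, 16, 4
point_feats = rng.normal(size=(n, c))
image_feats = rng.normal(size=(n, c))
params = {
    "img": (rng.normal(size=(c, c // r)), rng.normal(size=(c // r, c))),
    "pts": (rng.normal(size=(c, c // r)), rng.normal(size=(c // r, c))),
}
fused = cross_attention_fuse(point_feats, image_feats, params)
print(fused.shape)
```

In a trained network the random weights would be learned parameters, and the merge step could equally be a sum or a further convolution; the sketch is meant only to make the "cross-over" idea concrete.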

Pages: 6131-6146

Page count: 16
