CAF-RCNN: multimodal 3D object detection with cross-attention

Cited by: 0
Authors
Liu, Junting [1]
Liu, Deer [1, 2]
Zhu, Lei [1]
Affiliations
[1] Jiangxi Univ Sci & Technol, Sch Civil & Surveying & Mapping Engn, Ganzhou, Jiangxi, Peoples R China
[2] Jiangxi Univ Sci & Technol, Sch Civil & Surveying & Mapping Engn, Ganzhou 341400, Jiangxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
3D object detection; multimodal fusion; cross-attention mechanism; feature pyramid network;
DOI
10.1080/01431161.2023.2261151
Chinese Library Classification number
TP7 [Remote sensing technology];
Subject classification codes
081102 ; 0816 ; 081602 ; 083002 ; 1404 ;
Abstract
LiDAR and cameras are pivotal sensors for 3D (three-dimensional) object detection. Because their characteristics differ, a growing number of multimodal object detection methods have been proposed. Popular methods currently rely on a hard association between camera features and LiDAR features, but because the features are frequently enhanced and aggregated, aligning the two effectively remains a major challenge. We therefore propose CAF-RCNN. Building on PointRCNN, it uses a Feature Pyramid Network (FPN) to extract high-level semantic features at different scales, then fuses these features with the LiDAR features output by the Set Abstraction (SA) modules in PointRCNN and in subsequent steps. For feature fusion, we design a module based on the cross-attention mechanism, the Cross-Attention Fusion Module (CAFM). It combines two channel attention streams in a cross-over fashion to exploit rich details about significant objects in the Image Stream and the Geometric Stream. Extensive experiments on the KITTI dataset show that our method is 6.43% higher than PointRCNN in 3D accuracy.
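The abstract does not give the internals of CAFM, so the following is only a rough sketch of what "two channel attention streams combined in a cross-over fashion" could look like, not the authors' implementation. It assumes the image features have already been sampled at the projected LiDAR point locations, so both streams share the same number of points and channels. The learned layers of a real channel attention block are replaced by a plain sigmoid over mean-pooled channels, and the function names `channel_attention` and `cross_attention_fuse` are hypothetical.

```python
import math


def channel_attention(features):
    """Return one weight in (0, 1) per channel.

    features: list of N points, each a list of C channel values.
    Assumption: a sigmoid over the per-channel mean stands in for
    the learned layers a real channel attention block would use.
    """
    n = len(features)
    c = len(features[0])
    pooled = [sum(row[j] for row in features) / n for j in range(c)]
    return [1.0 / (1.0 + math.exp(-p)) for p in pooled]


def cross_attention_fuse(point_feats, image_feats):
    """Reweight each stream with the other stream's channel weights, then add."""
    w_from_points = channel_attention(point_feats)
    w_from_image = channel_attention(image_feats)
    fused = []
    for p_row, i_row in zip(point_feats, image_feats):
        fused.append([
            p * wi + i * wp  # cross-over: image weights scale point features, and vice versa
            for p, i, wp, wi in zip(p_row, i_row, w_from_points, w_from_image)
        ])
    return fused


# Toy input: 2 points, 2 channels per stream (made-up values).
geometric_stream = [[0.0, 2.0], [0.0, 2.0]]
image_stream = [[1.0, 0.0], [3.0, 0.0]]
fused = cross_attention_fuse(geometric_stream, image_stream)
```

In this sketch the fused output keeps the per-point layout of the inputs, so it could in principle be passed to later point-based stages in place of the LiDAR-only features. Whether CAF-RCNN sums, concatenates, or otherwise merges the two reweighted streams is not stated in the abstract.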
Pages: 6131-6146
Page count: 16
Related papers
50 in total
  • [31] An efficient 3D object detection method based on Fast Guided Anchor Stereo RCNN
    Tao, Chongben
    Cao, Chunlin
    Cheng, Hanjing
    Gao, Zhen
    Luo, Xizhao
    Zhang, Zuofeng
    Zheng, Sifa
    ADVANCED ENGINEERING INFORMATICS, 2023, 57
  • [32] VISTA: Boosting 3D Object Detection via Dual Cross-VIew SpaTial Attention
    Deng, Shengheng
    Liang, Zhihao
    Sun, Lin
    Jia, Kui
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 8438 - 8447
  • [33] Cross-Attention Regression Flow for Defect Detection
    Liu, Binhui
    Guo, Tianchu
    Luo, Bin
    Cui, Zhen
    Yang, Jian
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 5183 - 5193
  • [34] Investigating Attention Mechanism in 3D Point Cloud Object Detection
    Qiu, Shi
    Wu, Yunfan
    Anwar, Saeed
    Li, Chongyi
    2021 INTERNATIONAL CONFERENCE ON 3D VISION (3DV 2021), 2021, : 403 - 412
  • [35] Attention-based Proposals Refinement for 3D Object Detection
    Minh-Quan Dao
    Hery, Elwan
    Fremont, Vincent
    2022 IEEE INTELLIGENT VEHICLES SYMPOSIUM (IV), 2022, : 197 - 205
  • [36] 3D Object Detection with Attention: Shell-Based Modeling
    Zhang X.
    Zhao Z.
    Sun W.
    Cui Q.
    Computer Systems Science and Engineering, 2023, 46 (01): : 537 - 550
  • [37] ARPNET: attention region proposal network for 3D object detection
    Yangyang Ye
    Chi Zhang
    Xiaoli Hao
    Science China Information Sciences, 2019, 62
  • [39] Image attention transformer network for indoor 3D object detection
    Ren, Keyan
    Yan, Tong
    Hu, Zhaoxin
    Han, Honggui
    Zhang, Yunlu
    SCIENCE CHINA-TECHNOLOGICAL SCIENCES, 2024, 67 (07) : 2176 - 2190