DMFF: dual-way multimodal feature fusion for 3D object detection

Cited by: 0
Authors
Dong, Xiaopeng [1 ]
Di, Xiaoguang [1 ]
Wang, Wenzhuang [1 ]
Affiliations
[1] Harbin Inst Technol, Control & Simulat Ctr, Harbin, Peoples R China
Funding
Natural Science Foundation of Heilongjiang Province;
Keywords
3D object detection; Multimodal feature fusion; Self-attention mechanism; Lidar point clouds; RGB images;
DOI
10.1007/s11760-023-02772-z
Chinese Library Classification
TM [Electrical engineering]; TN [Electronic and communication technology];
Discipline classification code
0808 ; 0809 ;
Abstract
Multimodal 3D object detection, which fuses complementary information from LiDAR data and RGB images, has recently become an active research topic. However, fusing images and point clouds is not trivial because the two modalities use different representations, and inadequate feature fusion degrades detection performance. To address these problems, we convert images into pseudo point clouds using depth completion and adopt a more efficient feature fusion method. In this paper, we propose a dual-way multimodal feature fusion network (DMFF) for 3D object detection. Specifically, we first use a dual-stream feature extraction module (DSFE) to generate homogeneous LiDAR and pseudo region-of-interest (RoI) features. Then, we propose a dual-way feature interaction method (DWFI) that enables both intermodal and intramodal interaction between the two features. Next, we design a local attention feature fusion module (LAFF) that selects the input features most likely to contribute to the desired output. The proposed DMFF achieves state-of-the-art performance on the KITTI dataset.
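The abstract's first step, turning an image into a pseudo point cloud, can be illustrated with standard pinhole back-projection. The sketch below is not the authors' implementation: it assumes a dense depth map has already been produced by some depth completion method, and the function name `depth_to_pseudo_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
import numpy as np

def depth_to_pseudo_points(depth, rgb, fx, fy, cx, cy):
    """Back-project a dense depth map into an (N, 6) array of x, y, z, r, g, b.

    depth: (H, W) array of depths in the camera frame (assumed input,
           e.g. the output of a depth completion network).
    rgb:   (H, W, 3) image aligned with the depth map.
    fx, fy, cx, cy: pinhole camera intrinsics (hypothetical parameters).
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Invert the pinhole projection u = fx * x / z + cx, v = fy * y / z + cy.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    colors = rgb.reshape(-1, 3).astype(np.float64)
    # Drop pixels that carry no depth estimate.
    valid = points[:, 2] > 0
    return np.hstack([points[valid], colors[valid]])
```

Each surviving pixel becomes one colored 3D point, so the image ends up in the same kind of representation as the LiDAR scan. A real pipeline would additionally transform these camera-frame points into the LiDAR frame using the dataset's calibration.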
Pages: 455-463
Page count: 9