MMFG: Multimodal-based Mutual Feature Gating 3D Object Detection

Cited by: 0
Authors
Xu, Wanpeng [1]
Fu, Zhipeng [1]
Affiliations
[1] Peng Cheng Lab, Dept New Pattern Network, Xingke 1st St, Shenzhen 518055, Guangdong, Peoples R China
Keywords
3D object detection; LiDAR; Multimodal fusion; Gating mechanism;
DOI
10.1007/s10846-024-02119-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Image and point cloud features are often fused only coarsely, which prevents deep fusion. To address this problem, this paper proposes a multimodal 3D object detection architecture based on a mutual feature gating mechanism. First, because feature aggregation based on the set abstraction layer cannot capture fine-grained features, a point-based self-attention module is designed. This module is added to the point cloud feature extraction branch to achieve fine-grained feature aggregation while preserving accurate location information. Second, a new gating mechanism is designed for the deep fusion of image and point cloud features. Deep fusion is achieved through mutual feature weighting between the image and the point cloud. The fused features are then fed into a feature refinement network to produce classification confidences and 3D bounding boxes. Finally, a multi-scale detection architecture is proposed to recover a more complete object shape, and a location-based feature encoding algorithm is designed to adaptively focus the interest points within the region of interest. The whole architecture shows outstanding performance on the KITTI 3D and nuScenes datasets, especially at the hard difficulty level. These results indicate that the framework addresses the low detection rates of LiDAR-only detection, which are caused by the small number of surface points returned from distant objects.
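The abstract does not give the gating equations, so the following is only a minimal sketch of what "mutual feature weighting" between two modalities could look like, not the authors' implementation. It assumes that image features have already been sampled at the projected location of each point, that each gate is a sigmoid of a linear projection of the other modality, and that the gated features are fused by concatenation. The function name `mutual_gate_fusion`, the weight matrices, and all shapes are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def mutual_gate_fusion(point_feats, image_feats, w_img_to_pt, w_pt_to_img):
    """Hypothetical mutual gating of per-point LiDAR and image features.

    point_feats: (N, Cp) features of N points.
    image_feats: (N, Ci) image features sampled at each point's projection
                 (assumed to be computed beforehand).
    w_img_to_pt: (Ci, Cp) projection used to build the gate for point features.
    w_pt_to_img: (Cp, Ci) projection used to build the gate for image features.
    Returns an (N, Cp + Ci) fused feature matrix.
    """
    # Each modality is weighted by a gate derived from the other modality.
    gate_for_points = sigmoid(image_feats @ w_img_to_pt)  # (N, Cp), values in (0, 1)
    gate_for_image = sigmoid(point_feats @ w_pt_to_img)   # (N, Ci), values in (0, 1)

    gated_points = point_feats * gate_for_points
    gated_image = image_feats * gate_for_image

    # Assumption: fuse by concatenating the two gated feature sets.
    return np.concatenate([gated_points, gated_image], axis=1)


# Toy inputs with made-up sizes, only to show the shapes involved.
rng = np.random.default_rng(0)
num_points, c_point, c_image = 5, 8, 4
point_feats = rng.normal(size=(num_points, c_point))
image_feats = rng.normal(size=(num_points, c_image))
w_img_to_pt = rng.normal(size=(c_image, c_point))
w_pt_to_img = rng.normal(size=(c_point, c_image))

fused = mutual_gate_fusion(point_feats, image_feats, w_img_to_pt, w_pt_to_img)
print(fused.shape)
```

In this sketch the gates only attenuate features (each gate value lies strictly between 0 and 1), so a modality that the other considers uninformative at a given point contributes less to the fused vector. The paper's actual gate design and fusion operator may differ.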
Pages: 17