MAFF-Net: Filter False Positive for 3D Vehicle Detection with Multi-modal Adaptive Feature Fusion

Cited by: 8
Authors
Zhang, Zehan [1, 2]
Shen, Yuxi [1]
Li, Hao [1]
Zhao, Xian [1]
Yang, Ming [2]
Tan, Wenming [1]
Pu, ShiLiang [1]
Mao, Hui [1]
Affiliations
[1] Hangzhou Hikvis Digital Technol Co Ltd, Hikvis Res Inst, Hangzhou, Peoples R China
[2] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
Keywords
Point cloud
DOI
10.1109/ITSC55140.2022.9922104
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
3D vehicle detection based on multi-modal fusion is an important task in many applications such as autonomous driving. Although significant progress has been made, we still observe two aspects that call for further improvement. First, previous work has seldom explored what extra information can be obtained from images to complement point clouds in 3D detection tasks. Second, most fusion modules can only be used in the network they were designed for, and so lack universality. In this work, we propose PointAttentionFusion and DenseAttentionFusion, two end-to-end trainable single-stage multi-modal feature fusion approaches that adaptively combine the RGB and point cloud modalities. Experimental results on the KITTI dataset demonstrate significant improvement in filtering false positives over approaches that use only point cloud data. Furthermore, the proposed methods provide competitive results compared to the published state-of-the-art multi-modal methods on the KITTI benchmark. Both fusion modules are applicable to all voxel-based 3D detection architectures, and similar improvements are expected.
Pages: 369-376
Page count: 8