Virtual Sparse Convolution for Multimodal 3D Object Detection

Cited by: 70
|
Authors
Wu, Hai [1 ]
Wen, Chenglu [1 ]
Shi, Shaoshuai [2 ]
Li, Xin [3 ]
Wang, Cheng [1 ]
Affiliations
[1] Xiamen Univ, Xiamen, Peoples R China
[2] Max Planck Inst, Munich, Germany
[3] Texas A&M Univ, College Stn, TX 77843 USA
Funding
National Natural Science Foundation of China;
DOI
10.1109/CVPR52729.2023.02074
CLC Number
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Recently, virtual/pseudo-point-based 3D object detection that seamlessly fuses RGB images and LiDAR data by depth completion has gained great attention. However, virtual points generated from an image are very dense, introducing a huge amount of redundant computation during detection. Meanwhile, noise introduced by inaccurate depth completion significantly degrades detection precision. This paper proposes a fast yet effective backbone, termed VirConvNet, based on a new operator VirConv (Virtual Sparse Convolution), for virtual-point-based 3D object detection. VirConv consists of two key designs: (1) StVD (Stochastic Voxel Discard) and (2) NRConv (Noise-Resistant Sub-manifold Convolution). StVD alleviates the computation problem by discarding large amounts of nearby redundant voxels. NRConv tackles the noise problem by encoding voxel features in both 2D image and 3D LiDAR space. By integrating VirConv, we first develop an efficient pipeline, VirConv-L, based on an early-fusion design. Then, we build a high-precision pipeline, VirConv-T, based on a transformed refinement scheme. Finally, we develop a semi-supervised pipeline, VirConv-S, based on a pseudo-label framework. On the KITTI car 3D detection test leaderboard, our VirConv-L achieves 85% AP with a fast running speed of 56ms. Our VirConv-T and VirConv-S attain a high precision of 86.3% and 87.2% AP, and currently rank 2nd and 1st, respectively. The code is available at https://github.com/hailanyi/VirConv.
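The StVD idea in the abstract (randomly discarding a large fraction of nearby, redundant voxels while keeping all distant ones) can be illustrated with a minimal sketch. Note that the function name, the near-range radius, and the keep ratio below are illustrative assumptions, not the paper's actual design or hyperparameters:

```python
import numpy as np

def stochastic_voxel_discard(voxel_centers, near_radius=30.0, keep_ratio=0.2, seed=None):
    """Illustrative StVD sketch: keep every distant voxel, but retain only a
    random fraction (keep_ratio) of voxels closer than near_radius to the sensor.

    voxel_centers: (N, 3) array of voxel center coordinates in LiDAR frame.
    """
    rng = np.random.default_rng(seed)
    # Distance from the sensor in the ground (x, y) plane.
    dist = np.linalg.norm(voxel_centers[:, :2], axis=1)
    near = dist < near_radius
    # Distant voxels are always kept; nearby ones survive with prob. keep_ratio.
    keep = ~near | (rng.random(len(voxel_centers)) < keep_ratio)
    return voxel_centers[keep]
```

Since virtual points from depth completion are densest close to the sensor, subsampling only the near range cuts most of the redundant computation while leaving the sparse far-range voxels, which matter most for detection recall, untouched.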
Pages: 21653-21662
Page count: 10
Related Papers
50 items in total
  • [21] FSD V2: Improving Fully Sparse 3D Object Detection With Virtual Voxels
    Fan, Lue
    Wang, Feng
    Wang, Naiyan
    Zhang, Zhaoxiang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (02) : 1279 - 1292
  • [22] MonoDCN: Monocular 3D object detection based on dynamic convolution
    Qu, Shenming
    Yang, Xinyu
    Gao, Yiming
    Liang, Shengbin
    PLOS ONE, 2022, 17 (10):
  • [23] 3D object detection based on sparse convolution neural network and feature fusion for autonomous driving in smart cities
    Wang, Lei
    Fan, Xiaoyun
    Chen, Jiahao
    Cheng, Jun
    Tan, Jun
    Ma, Xiaoliang
    SUSTAINABLE CITIES AND SOCIETY, 2020, 54 (54)
  • [24] FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
    Xu, Shaoqing
    Zhou, Dingfu
    Fang, Jin
    Yin, Junbo
    Bin, Zhou
    Zhang, Liangjun
    2021 IEEE INTELLIGENT TRANSPORTATION SYSTEMS CONFERENCE (ITSC), 2021, : 3047 - 3054
  • [25] Real-Time Multimodal 3D Object Detection with Transformers
    Liu, Hengsong
    Duan, Tongle
    WORLD ELECTRIC VEHICLE JOURNAL, 2024, 15 (07):
  • [26] VoxelNeXt: Fully Sparse VoxelNet for 3D Object Detection and Tracking
    Chen, Yukang
    Liu, Jianhui
    Zhang, Xiangyu
    Qi, Xiaojuan
    Jia, Jiaya
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21674 - 21683
  • [27] MVX-Net: Multimodal VoxelNet for 3D Object Detection
    Sindagi, Vishwanath A.
    Zhou, Yin
    Tuzel, Oncel
    2019 INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2019, : 7276 - 7282
  • [28] Multimodal Sparse Features for Object Detection
    Haker, Martin
    Martinetz, Thomas
    Barth, Erhardt
    ARTIFICIAL NEURAL NETWORKS - ICANN 2009, PT II, 2009, 5769 : 923 - 932
  • [29] Sparse2Dense: Learning to Densify 3D Features for 3D Object Detection
    Wang, Tianyu
    Hu, Xiaowei
    Liu, Zhengzhe
    Fu, Chi-Wing
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [30] Monocular 3D Object Detection Utilizing Auxiliary Learning With Deformable Convolution
    Chen, Jiun-Han
    Shieh, Jeng-Lun
    Haq, Muhamad Amirul
    Ruan, Shanq-Jang
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (03) : 2424 - 2436