Virtual Sparse Convolution for Multimodal 3D Object Detection

Cited by: 70

Authors:

Wu, Hai [1]

Wen, Chenglu [1]

Shi, Shaoshuai [2]

Li, Xin [3]

Wang, Cheng [1]

Affiliations:

[1] Xiamen Univ, Xiamen, Peoples R China

[2] Max Planck Inst, Munich, Germany

[3] Texas A&M Univ, College Stn, TX 77843 USA

Source:

2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2023

Funding:

National Natural Science Foundation of China

Keywords:

DOI:

10.1109/CVPR52729.2023.02074

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Recently, virtual/pseudo-point-based 3D object detection, which seamlessly fuses RGB images and LiDAR data by depth completion, has gained great attention. However, virtual points generated from an image are very dense, introducing a huge amount of redundant computation during detection. Meanwhile, noise introduced by inaccurate depth completion significantly degrades detection precision. This paper proposes a fast yet effective backbone, termed VirConvNet, based on a new operator, VirConv (Virtual Sparse Convolution), for virtual-point-based 3D object detection. VirConv consists of two key designs: (1) StVD (Stochastic Voxel Discard) and (2) NRConv (Noise-Resistant Sub-manifold Convolution). StVD alleviates the computation problem by discarding large amounts of nearby redundant voxels. NRConv tackles the noise problem by encoding voxel features in both 2D image and 3D LiDAR space. By integrating VirConv, we first develop an efficient pipeline, VirConv-L, based on an early fusion design. Then, we build a high-precision pipeline, VirConv-T, based on a transformed refinement scheme. Finally, we develop a semi-supervised pipeline, VirConv-S, based on a pseudo-label framework. On the KITTI car 3D detection test leaderboard, our VirConv-L achieves 85% AP with a fast running speed of 56 ms. Our VirConv-T and VirConv-S attain high precision of 86.3% and 87.2% AP, and currently rank 2nd and 1st, respectively. The code is available at https://github.com/hailanyi/VirConv.
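
The StVD idea described in the abstract (discarding a large share of nearby redundant voxels) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stochastic_voxel_discard`, the `(x, y, z)` voxel layout, the 30 m range threshold, and the 10% keep ratio are all assumptions chosen for illustration. It only shows the general pattern of random, distance-dependent voxel dropping.

```python
import math
import random


def stochastic_voxel_discard(voxels, near_range=30.0, near_keep_ratio=0.1, seed=None):
    """Randomly drop most nearby voxels and keep every distant voxel.

    voxels: list of (x, y, z) voxel centres in metres (hypothetical layout).
    near_range, near_keep_ratio: illustrative values, not taken from the paper.
    """
    rng = random.Random(seed)
    kept = []
    for x, y, z in voxels:
        distance = math.hypot(x, y)  # horizontal range from the sensor origin
        # Distant voxels are always kept; nearby ones survive with a small probability.
        if distance >= near_range or rng.random() < near_keep_ratio:
            kept.append((x, y, z))
    return kept


# Toy data: 1000 dense nearby voxels and 100 sparse distant ones.
data_rng = random.Random(0)
near = [(data_rng.uniform(0, 20), data_rng.uniform(-5, 5), 0.0) for _ in range(1000)]
far = [(data_rng.uniform(40, 70), data_rng.uniform(-5, 5), 0.0) for _ in range(100)]
kept = stochastic_voxel_discard(near + far, seed=1)
print(len(near + far), "->", len(kept))
```

In this toy setup the distant voxels all survive while the dense nearby region is thinned, which is the kind of reduction in redundant computation that the abstract attributes to StVD.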

Citation

Pages: 21653 - 21662

Page count: 10

50 items in total

[41] Video Object Segmentation with 3D Convolution Network
Tang, Huiyun
Tao, Pin
Ma, Rui
Shi, Yuanchun
ICCCV 2019: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON CONTROL AND COMPUTER VISION, 2019: 28 - 32
[42] To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
Chai, Yuning
Sun, Pei
Ngiam, Jiquan
Wang, Weiyue
Caine, Benjamin
Vasudevan, Vijay
Zhang, Xiao
Anguelov, Dragomir
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021: 15995 - 16004
[43] Virtual 3D calibration object for the purpose of 3D reconstruction
Pribanic, T
Cifrek, M
Tonkovic, S
MEDICON 2001: PROCEEDINGS OF THE INTERNATIONAL FEDERATION FOR MEDICAL & BIOLOGICAL ENGINEERING, PTS 1 AND 2, 2001: 655 - 657
[44] LVP: Leverage Virtual Points in Multimodal Early Fusion for 3-D Object Detection
Chen, Yidong
Cai, Guorong
Song, Ziying
Liu, Zhaoliang
Zeng, Binghui
Li, Jonathan
Wang, Zongyue
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2025, 63
[45] Homogenous multimodal 3D object detection based on deformable Transformer and attribute dependencies
Dong, Yue
Li, Xingfeng
He, Hua
PROCEEDINGS OF 2024 3RD INTERNATIONAL CONFERENCE ON CYBER SECURITY, ARTIFICIAL INTELLIGENCE AND DIGITAL ECONOMY, CSAIDE 2024, 2024: 346 - 351
[46] MMFG: Multimodal-based Mutual Feature Gating 3D Object Detection
Xu, Wanpeng
Fu, Zhipeng
JOURNAL OF INTELLIGENT & ROBOTIC SYSTEMS, 2024, 110 (02)
[47] DMFF: dual-way multimodal feature fusion for 3D object detection
Dong, Xiaopeng
Di, Xiaoguang
Wang, Wenzhuang
SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (01) : 455 - 463
[48] CAF-RCNN: multimodal 3D object detection with cross-attention
Liu, Junting
Liu, Deer
Zhu, Lei
INTERNATIONAL JOURNAL OF REMOTE SENSING, 2023, 44 (19) : 6131 - 6146
[49] Multimodal feature adaptive fusion for anchor-free 3D object detection
Wu, Yanli
Wang, Junyin
Li, Hui
Ai, Xiaoxue
Li, Xiao
APPLIED INTELLIGENCE, 2025, 55 (07)
[50] Singular and Multimodal Techniques of 3D Object Detection: Constraints, Advancements and Research Direction
Karim, Tajbia
Mahayuddin, Zainal Rasyid
Hasan, Mohammad Kamrul
APPLIED SCIENCES-BASEL, 2023, 13 (24)
