VPFNet: Improving 3D Object Detection With Virtual Point Based LiDAR and Stereo Data Fusion

Cited by: 71
Authors
Zhu, Hanqi [1]
Deng, Jiajun [2]
Zhang, Yu [1]
Ji, Jianmin [1]
Mao, Qiuyu [1]
Li, Houqiang [2]
Zhang, Yanyong [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Peoples R China
[2] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Peoples R China
Keywords
3D object detection; multiple sensors; point clouds; stereo images; R-CNN
DOI
10.1109/TMM.2022.3189778
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
It is well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantic-rich stereo images benefits 3D object detection. Nevertheless, it is non-trivial to model the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, recent approaches generally project the 3D points onto the 2D image plane to sample the image data and then aggregate the data at the points. However, these approaches often suffer from the mismatch between the resolution of point clouds and that of RGB images, leading to sub-optimal performance. Specifically, taking the sparse points as the multi-modal data aggregation locations causes severe information loss for high-resolution images, which in turn undermines the effectiveness of multi-sensor fusion. In this paper, we present VPFNet, a new architecture that aligns and aggregates the point cloud and image data at "virtual" points. In particular, with a density lying between that of the 3D points and that of the 2D pixels, the virtual points bridge the resolution gap between the two sensors and thus preserve more information for processing. Moreover, we investigate data augmentation techniques that can be applied to both point clouds and RGB images, since data augmentation has made a non-negligible contribution to 3D object detectors to date. We have conducted extensive experiments on the KITTI dataset and have observed good performance compared to the state-of-the-art methods. Remarkably, our VPFNet achieves 83.21% moderate $AP_{3D}$ and 91.86% moderate $AP_{BEV}$ on the KITTI test set. The network design also takes computational efficiency into consideration: it runs at 15 FPS on a single NVIDIA RTX 2080Ti GPU.
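The point-based sampling step that the abstract attributes to earlier fusion approaches (project each 3D point onto the image plane, then read image data at that pixel) can be sketched roughly as follows. This is only an illustration of that generic step, not VPFNet's virtual-point design. The function names `project_points` and `sample_image_features`, the 3 x 4 projection matrix `P`, the assumption that points are already in the camera frame, and the nearest-pixel lookup are all assumptions made for this example and do not come from the paper.

```python
import numpy as np


def project_points(points_xyz, P):
    """Project N x 3 camera-frame points to pixel coordinates (hypothetical helper).

    P is an assumed 3 x 4 camera projection matrix. Returns an N x 2 array of
    (u, v) pixel coordinates and a boolean mask of points in front of the camera.
    """
    n = points_xyz.shape[0]
    homogeneous = np.hstack([points_xyz, np.ones((n, 1))])  # N x 4
    uvw = homogeneous @ P.T                                 # N x 3
    depth = uvw[:, 2]
    valid = depth > 1e-6                                    # drop points behind the camera
    uv = np.zeros((n, 2))
    uv[valid] = uvw[valid, :2] / depth[valid, None]         # perspective divide
    return uv, valid


def sample_image_features(image, uv, valid):
    """Read the nearest pixel for every projected point (hypothetical helper).

    Points that are invalid or fall outside the image receive zeros.
    """
    height, width = image.shape[:2]
    cols = np.rint(uv[:, 0]).astype(int)
    rows = np.rint(uv[:, 1]).astype(int)
    inside = valid & (cols >= 0) & (cols < width) & (rows >= 0) & (rows < height)
    features = np.zeros((uv.shape[0],) + image.shape[2:], dtype=image.dtype)
    features[inside] = image[rows[inside], cols[inside]]
    return features, inside
```

Because only one pixel is read per LiDAR point, every image pixel that no point lands on goes unused, which is the information loss for high-resolution images that the abstract describes.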
Pages: 5291-5304
Number of pages: 14