VPFNet: Improving 3D Object Detection With Virtual Point Based LiDAR and Stereo Data Fusion

Cited by: 71
Authors
Zhu, Hanqi [1]
Deng, Jiajun [2]
Zhang, Yu [1]
Ji, Jianmin [1]
Mao, Qiuyu [1]
Li, Houqiang [2]
Zhang, Yanyong [1]
Affiliations
[1] Univ Sci & Technol China, Sch Comp Sci & Technol, Hefei 230027, Peoples R China
[2] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Peoples R China
Keywords
3D object detection; multiple sensors; point clouds; stereo images; R-CNN
DOI
10.1109/TMM.2022.3189778
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
It has been well recognized that fusing the complementary information from depth-aware LiDAR point clouds and semantic-rich stereo images would benefit 3D object detection. Nevertheless, it is non-trivial to explore the inherently unnatural interaction between sparse 3D points and dense 2D pixels. To ease this difficulty, recent approaches generally project the 3D points onto the 2D image plane to sample the image data and then aggregate the data at the points. However, these approaches often suffer from the mismatch between the resolution of point clouds and RGB images, leading to sub-optimal performance. Specifically, taking the sparse points as the multi-modal data aggregation locations causes severe information loss for high-resolution images, which in turn undermines the effectiveness of multi-sensor fusion. In this paper, we present VPFNet, a new architecture that cleverly aligns and aggregates the point cloud and image data at the "virtual" points. In particular, with their density lying between that of the 3D points and 2D pixels, the virtual points can nicely bridge the resolution gap between the two sensors, and thus preserve more information for processing. Moreover, we also investigate the data augmentation techniques that can be applied to both point clouds and RGB images, as data augmentation has made a non-negligible contribution to 3D object detectors to date. We have conducted extensive experiments on the KITTI dataset, and have observed good performance compared to the state-of-the-art methods. Remarkably, our VPFNet achieves 83.21% moderate $AP_{3D}$ and 91.86% moderate $AP_{BEV}$ on the KITTI test set. The network design also takes computation efficiency into consideration: we can achieve 15 FPS on a single NVIDIA RTX 2080Ti GPU.
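The point-level fusion baseline that the abstract criticizes (project 3D points onto the image plane, sample the image there, aggregate at the points) can be sketched roughly as follows. This is an illustrative sketch only, not VPFNet's implementation: the function names `project_points` and `sample_image_features`, the 3x4 projection matrix `P`, and the nearest-pixel sampling are all assumptions made for the example, and the points are assumed to be already expressed in the camera frame.

```python
import numpy as np


def project_points(points_xyz, P):
    """Project N x 3 camera-frame points with a 3 x 4 matrix P (hypothetical helper).

    Returns pixel coordinates (N x 2, NaN for points not in front of the
    camera) and the projective depth of every point (N,).
    """
    n = points_xyz.shape[0]
    homo = np.hstack([points_xyz, np.ones((n, 1))])  # N x 4 homogeneous points
    uvw = homo @ P.T                                  # N x 3
    depth = uvw[:, 2]
    uv = np.full((n, 2), np.nan)
    front = depth > 1e-6                              # avoid dividing by <= 0
    uv[front] = uvw[front, :2] / depth[front, None]
    return uv, depth


def sample_image_features(points_xyz, image, P):
    """Attach the nearest pixel's values to each point that lands in the image.

    Returns (fused, valid): fused is M x (3 + C) for the M visible points,
    valid is a boolean mask of length N marking which points were kept.
    """
    h, w = image.shape[:2]
    uv, depth = project_points(points_xyz, P)
    valid = depth > 1e-6
    cols = np.zeros(len(uv), dtype=int)
    rows = np.zeros(len(uv), dtype=int)
    cols[valid] = np.round(uv[valid, 0]).astype(int)
    rows[valid] = np.round(uv[valid, 1]).astype(int)
    valid &= (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    feats = image[rows[valid], cols[valid]]           # M x C sampled pixel values
    fused = np.hstack([points_xyz[valid], feats])     # aggregate at the points
    return fused, valid


if __name__ == "__main__":
    # Toy camera and image; all values are made up for illustration.
    P = np.array([[100.0, 0.0, 50.0, 0.0],
                  [0.0, 100.0, 40.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0]])
    image = np.arange(80 * 100 * 3, dtype=float).reshape(80, 100, 3)
    points = np.array([[0.0, 0.0, 10.0],    # projects inside the image
                       [1.0, 0.0, 10.0],    # projects inside the image
                       [0.0, 0.0, -5.0],    # behind the camera
                       [10.0, 0.0, 10.0]])  # projects outside the image
    fused, valid = sample_image_features(points, image, P)
    print(fused.shape, valid)
    print("fraction of pixels used:", valid.sum() / (image.shape[0] * image.shape[1]))
```

In this toy setup only the pixels hit by a projected point contribute to the fused features, which is the resolution mismatch the abstract describes; the virtual points proposed in the paper are meant to sit at a density between the two, and their construction is not shown here.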
Pages: 5291-5304
Page count: 14