VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Citations: 2292
Authors
Zhou, Yin [1]
Tuzel, Oncel [1]
Affiliations
[1] Apple Inc, Cupertino, CA 95014 USA
Keywords
REPRESENTATION
DOI
10.1109/CVPR.2018.00472
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need of manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to an RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.
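The voxel partitioning step described in the abstract can be illustrated with a minimal, standard-library Python sketch. The function names (`voxelize`, `toy_voxel_feature`), the voxel size, and the sample points are all hypothetical and chosen only for illustration. The per-voxel summary below (centroid plus maximum offset) is a hand-written stand-in for the learned VFE layer, not the authors' implementation.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Group (x, y, z) points by the index of the equally spaced voxel that contains them."""
    voxels = defaultdict(list)
    for p in points:
        # Integer grid index along each axis; floor keeps negative coordinates consistent.
        idx = tuple(floor((p[i] - origin[i]) / voxel_size[i]) for i in range(3))
        voxels[idx].append(tuple(p))
    return dict(voxels)


def toy_voxel_feature(voxel_points):
    """Fixed-length summary of one voxel: centroid followed by the per-axis max offset.

    Assumption: this is a placeholder aggregation, NOT the learned VFE layer.
    """
    n = len(voxel_points)
    centroid = tuple(sum(p[i] for p in voxel_points) / n for i in range(3))
    max_offset = tuple(max(p[i] - centroid[i] for p in voxel_points) for i in range(3))
    return centroid + max_offset


# Hypothetical sample data: four points, 0.5-unit cubic voxels.
points = [(0.1, 0.1, 0.1), (0.3, 0.1, 0.1), (1.2, 0.1, 0.1), (-0.2, 0.1, 0.1)]
voxels = voxelize(points, voxel_size=(0.5, 0.5, 0.5))
features = {idx: toy_voxel_feature(pts) for idx, pts in voxels.items()}
```

Only non-empty voxels appear in the resulting dictionary, which suits a sparse LiDAR point cloud: each occupied voxel maps a variable number of points to one fixed-length vector.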
Pages: 4490-4499
Page count: 10