VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection

Cited by: 2292

Authors:

Zhou, Yin [1]

Tuzel, Oncel [1]

Affiliation:

[1] Apple Inc, Cupertino, CA 95014 USA

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Keywords:

REPRESENTATION;

DOI:

10.1109/CVPR.2018.00472

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Accurate detection of objects in 3D point clouds is a central problem in many applications, such as autonomous navigation, housekeeping robots, and augmented/virtual reality. To interface a highly sparse LiDAR point cloud with a region proposal network (RPN), most existing efforts have focused on hand-crafted feature representations, for example, a bird's eye view projection. In this work, we remove the need of manual feature engineering for 3D point clouds and propose VoxelNet, a generic 3D detection network that unifies feature extraction and bounding box prediction into a single stage, end-to-end trainable deep network. Specifically, VoxelNet divides a point cloud into equally spaced 3D voxels and transforms a group of points within each voxel into a unified feature representation through the newly introduced voxel feature encoding (VFE) layer. In this way, the point cloud is encoded as a descriptive volumetric representation, which is then connected to a RPN to generate detections. Experiments on the KITTI car detection benchmark show that VoxelNet outperforms the state-of-the-art LiDAR based 3D detection methods by a large margin. Furthermore, our network learns an effective discriminative representation of objects with various geometries, leading to encouraging results in 3D detection of pedestrians and cyclists, based on only LiDAR.
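The voxel partitioning step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `voxelize` and `voxel_feature`, the voxel size, and the grid origin are all hypothetical, and the element-wise maximum is only a fixed stand-in for the learned VFE layer, whose details the abstract does not give.

```python
import numpy as np

def voxelize(points, voxel_size, range_min):
    """Group points by the equally spaced voxel each one falls into.

    points:     (N, 3) array of x, y, z coordinates.
    voxel_size: (3,) voxel edge lengths (hypothetical values).
    range_min:  (3,) lower corner of the grid (hypothetical values).
    Returns a dict mapping an integer voxel index (i, j, k) to an (M, 3) array.
    """
    points = np.asarray(points, dtype=np.float64)
    # Integer voxel index of every point along each axis.
    idx = np.floor((points - range_min) / voxel_size).astype(np.int64)
    voxels = {}
    for key, p in zip(map(tuple, idx.tolist()), points):
        voxels.setdefault(key, []).append(p)
    return {k: np.stack(v) for k, v in voxels.items()}

def voxel_feature(points_in_voxel):
    """Stand-in for a learned encoder: element-wise max over the voxel's points."""
    return points_in_voxel.max(axis=0)

# Toy input: two points share a voxel, the third lands in another one.
points = np.array([[0.1, 0.1, 0.1],
                   [0.3, 0.2, 0.1],
                   [1.2, 0.1, 0.1]])
voxels = voxelize(points, voxel_size=np.array([0.5, 0.5, 0.5]),
                  range_min=np.zeros(3))
features = {k: voxel_feature(v) for k, v in voxels.items()}
```

Only non-empty voxels appear in the dictionary, which is one simple way to cope with the sparsity of LiDAR point clouds noted in the abstract.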

Pages: 4490-4499

Page count: 10

50 items in total

[1] End-to-End 3D Object Detection using LiDAR Point Cloud
Raut, Gaurav
Patole, Advait
[J]. 2024 IEEE 3RD INTERNATIONAL CONFERENCE ON COMPUTING AND MACHINE INTELLIGENCE, ICMI 2024, 2024,
[2] SparseDet: Towards End-to-End 3D Object Detection
Han, Jianhong
Wan, Zhaoyi
Liu, Zhe
Feng, Jie
Zhou, Bingfeng
[J]. PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 4, 2022, : 781 - 792
[3] An End-to-End Transformer Model for 3D Object Detection
Misra, Ishan
Girdhar, Rohit
Joulin, Armand
[J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 2886 - 2897
[4] End-to-End 3D Point Cloud Learning for Registration Task Using Virtual Correspondences
Wei, Huanshu
Qiao, Zhijian
Liu, Zhe
Suo, Chuanzhe
Yin, Peng
Shen, Yueling
Li, Haoang
Wang, Hesheng
[J]. 2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, : 2678 - 2683
[5] End-to-End Learning the Partial Permutation Matrix for Robust 3D Point Cloud Registration
Zhang, Zhiyuan
Sun, Jiadai
Dai, Yuchao
Zhou, Dingfu
Song, Xibin
He, Mingyi
[J]. THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELVETH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 3399 - 3407
[6] End-to-End Multi-View Fusion for 3D Object Detection in LiDAR Point Clouds
Zhou, Yin
Sun, Pei
Zhang, Yu
Anguelov, Dragomir
Gao, Jiyang
Ouyang, Tom
Guo, James
Ngiam, Jiquan
Vasudevan, Vijay
[J]. CONFERENCE ON ROBOT LEARNING, VOL 100, 2019, 100
[7] A survey on end-to-end point cloud learning
Tang, Xikai
Huang, Fangzheng
Li, Chao
Ban, Dayan
[J]. IET IMAGE PROCESSING, 2023, 17 (05) : 1307 - 1321
[8] End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection
Qian, Rui
Garg, Divyansh
Wang, Yan
You, Yurong
Belongie, Serge
Hariharan, Bharath
Campbell, Mark
Weinberger, Kilian Q.
Chao, Wei-Lun
[J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2020, : 5880 - 5889
[9] 3D-SiamRPN: An End-to-End Learning Method for Real-Time 3D Single Object Tracking Using Raw Point Cloud
Fang, Zheng
Zhou, Sifan
Cui, Yubo
Scherer, Sebastian
[J]. IEEE SENSORS JOURNAL, 2021, 21 (04) : 4995 - 5011
[10] YOLO3D: End-to-End Real-Time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud
Ali, Waleed
Abdelkarim, Sherif
Zidan, Mahmoud
Zahran, Mohamed
El Sallab, Ahmad
[J]. COMPUTER VISION - ECCV 2018 WORKSHOPS, PT III, 2019, 11131 : 716 - 728