Is Pseudo-Lidar needed for Monocular 3D Object detection?

Cited by: 88
Authors
Park, Dennis [1]
Ambrus, Rares [1]
Guizilini, Vitor [1]
Li, Jie [1]
Gaidon, Adrien [1]
Affiliation
[1] Toyota Res Inst, Cambridge, MA 02139 USA
DOI
10.1109/ICCV48922.2021.00313
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent progress in 3D object detection from single images leverages monocular depth estimation as a way to produce 3D pointclouds, turning cameras into pseudo-lidar sensors. These two-stage detectors improve with the accuracy of the intermediate depth estimation network, which can itself be improved without manual labels via large-scale self-supervised learning. However, they tend to suffer from overfitting more than end-to-end methods, are more complex, and the gap with similar lidar-based detectors remains significant. In this work, we propose an end-to-end, single-stage, monocular 3D object detector, DD3D, that can benefit from depth pre-training like pseudo-lidar methods, but without their limitations. Our architecture is designed for effective information transfer between depth estimation and 3D detection, allowing us to scale with the amount of unlabeled pre-training data. Our method achieves state-of-the-art results on two challenging benchmarks, with 16.34% and 9.28% AP for Cars and Pedestrians (respectively) on the KITTI-3D benchmark, and 41.5% mAP on NuScenes.
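The pseudo-lidar step mentioned in the abstract (turning a predicted depth map into a 3D pointcloud) can be illustrated with a minimal sketch. This is not code from the paper: it assumes a simple pinhole camera model, and the function name `backproject_depth` and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def backproject_depth(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a dense depth map into camera-frame 3D points.

    Assumes a pinhole camera (hypothetical intrinsics fx, fy, cx, cy).
    depth[v][u] is the depth in meters at pixel column u, row v.
    Pixels with non-positive depth are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                continue
            # Invert the pinhole projection u = fx * x / z + cx (same for v).
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth map; one pixel has no valid depth.
depth_map = [
    [10.0, 10.0],
    [0.0, 20.0],
]
cloud = backproject_depth(depth_map, fx=2.0, fy=2.0, cx=0.5, cy=0.5)
```

In a two-stage pseudo-lidar pipeline, a pointcloud like `cloud` would then be passed to a lidar-style 3D detector; the single-stage detector described in the abstract avoids this intermediate representation.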
Pages: 3122-3132
Number of pages: 11