MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts

Cited by: 8
Authors
Wu, Zizhang [1]
Gan, Yuanzhu [1]
Wang, Lei [1]
Chen, Guilian [1]
Pu, Jian [2]
Affiliations
[1] Zongmu Technology, Shanghai, People's Republic of China
[2] Fudan University, Shanghai, People's Republic of China
DOI
10.1109/ICRA48891.2023.10161442
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Monocular 3D object detection is an economical but challenging task in autonomous driving. Recently, center-based monocular methods have developed rapidly, offering a good trade-off between speed and accuracy; they usually estimate the depth of the object center from 2D features. However, visual semantic features that lack sufficient pixel geometry information may provide weak cues for spatial 3D detection tasks. To alleviate this, we propose MonoPGC, a novel end-to-end Monocular 3D object detection framework with rich Pixel Geometry Contexts. We introduce pixel depth estimation as an auxiliary task and design a depth cross-attention pyramid module (DCPM) to inject local and global depth geometry knowledge into visual features. In addition, we present the depth-space-aware transformer (DSAT) to integrate 3D space position and depth-aware features efficiently. We also design a novel depth-gradient positional encoding (DGPE) to bring more distinct pixel geometry contexts into the transformer for better object detection. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the KITTI dataset.
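The abstract describes injecting depth geometry into visual features through cross-attention. The sketch below is only a generic illustration of that idea, not the DCPM as published: it assumes visual tokens act as queries and depth tokens as keys and values, omits learned projections and the pyramid structure, and uses a hypothetical function name (`depth_cross_attention`) and toy inputs.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def depth_cross_attention(visual, depth):
    """Fuse depth tokens into visual tokens (assumed design, no learned weights).

    visual: N x C list of visual feature tokens (queries).
    depth:  M x C list of depth feature tokens (keys and values).
    Returns the fused N x C features and the N x M attention weights.
    """
    c = len(visual[0])
    depth_t = [list(col) for col in zip(*depth)]      # C x M
    scores = matmul(visual, depth_t)                  # N x M similarities
    weights = [softmax([s / math.sqrt(c) for s in row]) for row in scores]
    context = matmul(weights, depth)                  # N x C depth context
    # Residual connection: keep the visual feature and add the depth context.
    fused = [[v + d for v, d in zip(vrow, crow)]
             for vrow, crow in zip(visual, context)]
    return fused, weights


# Toy example: 2 visual tokens, 3 depth tokens, 2 channels.
visual = [[1.0, 0.0], [0.0, 1.0]]
depth = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused, weights = depth_cross_attention(visual, depth)
```

Each visual token ends up weighted toward the depth tokens it is most similar to; a real module would add learned query, key, and value projections and operate on multi-scale feature maps.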
Pages: 4842-4849
Page count: 8
Related papers
50 in total
  • [1] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
    Liu, Xianpeng
    Xue, Nan
    Wu, Tianfu
    THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2022, : 1810 - 1818
  • [2] Geometry Uncertainty Projection Network for Monocular 3D Object Detection
    Lu, Yan
    Ma, Xinzhu
    Yang, Lei
    Zhang, Tianzhu
    Liu, Yating
    Chu, Qi
    Yan, Junjie
    Ouyang, Wanli
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 3091 - 3101
  • [3] Geometry-based Distance Decomposition for Monocular 3D Object Detection
    Shi, Xuepeng
    Ye, Qi
    Chen, Xiaozhi
    Chen, Chuangrong
    Chen, Zhixiang
    Kim, Tae-Kyun
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 15152 - 15161
  • [4] Geometry-Guided Domain Generalization for Monocular 3D Object Detection
    Yang, Fan
    Chen, Hui
    He, Yuwei
    Zhao, Sicheng
    Zhang, Chenghao
    Ni, Kai
    Ding, Guiguang
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 6, 2024, : 6467 - 6476
  • [5] Aerial Monocular 3D Object Detection
    Hu, Yue
    Fang, Shaoheng
    Xie, Weidi
    Chen, Siheng
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (04) : 1959 - 1966
  • [6] Disentangling Monocular 3D Object Detection
    Simonelli, Andrea
    Bulo, Samuel Rota
    Porzi, Lorenzo
    Lopez-Antequera, Manuel
    Kontschieder, Peter
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1991 - 1999
  • [7] Monocular 3D Object Detection for Autonomous Driving
    Chen, Xiaozhi
    Kundu, Kaustav
    Zhang, Ziyu
    Ma, Huimin
    Fidler, Sanja
    Urtasun, Raquel
    2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, : 2147 - 2156
  • [8] Dimension Embeddings for Monocular 3D Object Detection
    Zhang, Yunpeng
    Zheng, Wenzhao
    Zhu, Zheng
    Huang, Guan
    Du, Dalong
    Zhou, Jie
    Lu, Jiwen
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1579 - 1588
  • [9] Learning Occupancy for Monocular 3D Object Detection
    Peng, Liang
    Xu, Junkai
    Cheng, Haoran
    Yang, Zheng
    Wu, Xiaopei
    Qian, Wei
    Wang, Wenxiao
    Wu, Boxi
    Cai, Deng
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2024, : 10281 - 10292
  • [10] Uncertainty Prediction for Monocular 3D Object Detection
    Mun, Junghwan
    Choi, Hyukdoo
    SENSORS, 2023, 23 (12)