MonoPGC: Monocular 3D Object Detection with Pixel Geometry Contexts

Cited by: 8

Authors:

Wu, Zizhang [1]

Gan, Yuanzhu [1]

Wang, Lei [1]

Chen, Guilian [1]

Pu, Jian [2]

Affiliations:

[1] Zongmu Technology, Shanghai, People's Republic of China

[2] Fudan University, Shanghai, People's Republic of China

Source:

2023 IEEE International Conference on Robotics and Automation (ICRA) | 2023

DOI:

10.1109/ICRA48891.2023.10161442

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Monocular 3D object detection is an economical but challenging task in autonomous driving. Center-based monocular methods have recently developed rapidly, offering a good trade-off between speed and accuracy, but they usually rely on estimating the depth of the object center from 2D features. However, visual semantic features that lack sufficient pixel geometry information may provide weak cues for spatial 3D detection tasks. To alleviate this, we propose MonoPGC, a novel end-to-end Monocular 3D object detection framework with rich Pixel Geometry Contexts. We introduce pixel depth estimation as an auxiliary task and design a depth cross-attention pyramid module (DCPM) to inject local and global depth geometry knowledge into visual features. In addition, we present the depth-space-aware transformer (DSAT) to efficiently integrate 3D space positions and depth-aware features. We also design a novel depth-gradient positional encoding (DGPE) to bring more distinct pixel geometry contexts into the transformer for better object detection. Extensive experiments demonstrate that our method achieves state-of-the-art performance on the KITTI dataset.

Pages: 4842-4849

Page count: 8
