MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer

被引:42
|
作者
Huang, Kuan-Chih [1 ]
Wu, Tsung-Han [1 ]
Su, Hung-Ting [1 ]
Hsu, Winston H. [1 ,2 ]
机构
[1] Natl Taiwan Univ, Taipei, Taiwan
[2] Mobile Drive Technol, New Taipei, Taiwan
关键词
D O I
10.1109/CVPR52688.2022.00398
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Monocular 3D object detection is an important yet challenging task in autonomous driving. Some existing methods leverage depth information from an off-the-shelf depth estimator to assist 3D detection, but suffer from the additional computational burden and achieve limited performance caused by inaccurate depth priors. To alleviate this, we propose MonoDTR, a novel end-to-end depth-aware transformer network for monocular 3D object detection. It mainly consists of two components: (1) the Depth-Aware Feature Enhancement (DFE) module that implicitly learns depth-aware features with auxiliary supervision without requiring extra computation, and (2) the Depth-Aware Transformer (DTR) module that globally integrates context- and depth-aware features. Moreover, different from conventional pixel-wise positional encodings, we introduce a novel depth positional encoding (DPE) to inject depth positional hints into transformers. Our proposed depth-aware modules can be easily plugged into existing image-only monocular 3D object detectors to improve the performance. Extensive experiments on the KITTI dataset demonstrate that our approach outperforms previous state-of-the-art monocularbased methods and achieves real-time detection.
引用
收藏
页码:4002 / 4011
页数:10
相关论文
共 50 条
  • [21] DEPTH-ASSISTED JOINT DETECTION NETWORK FOR MONOCULAR 3D OBJECT DETECTION
    Lei, Jianjun
    Guo, Tingyi
    Peng, Bo
    Yu, Chuanbo
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2204 - 2208
  • [22] Monocular 3D object detection with thermodynamic loss and decoupled instance depth
    Liu, Gang
    Xie, Xiaoxiao
    Yu, Qingchen
    [J]. CONNECTION SCIENCE, 2024, 36 (01)
  • [23] Depth-discriminative Metric Learning for Monocular 3D Object Detection
    Choi, Wonhyeok
    Shin, Mingyu
    Im, Sunghoon
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [24] Learning Depth-Guided Convolutions for Monocular 3D Object Detection
    Ng, Mingyu
    Huo, Yuqi
    Yi, Hongwei
    Wang, Zhe
    Shi, Jianping
    Lu, Zhiwu
    Luo, Ping
    [J]. 2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW 2020), 2020, : 4306 - 4315
  • [25] Depth dynamic center difference convolutions for monocular 3D object detection
    Wu, Xinyu
    Ma, Dongliang
    Qu, Xin
    Jiang, Xin
    Zeng, Dan
    [J]. NEUROCOMPUTING, 2023, 520 : 73 - 81
  • [26] Depth-aware inverted refinement network for RGB-D salient object detection
    Gao, Lina
    Liu, Bing
    Fu, Ping
    Xu, Mingzhu
    [J]. NEUROCOMPUTING, 2023, 518 : 507 - 522
  • [27] DEPTH-AWARE OBJECT INSTANCE SEGMENTATION
    Ye, Linwei
    Liu, Zhi
    Wang, Yang
    [J]. 2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 325 - 329
  • [28] Boosting Monocular 3D Object Detection With Object-Centric Auxiliary Depth Supervision
    Kim, Youngseok
    Kim, Sanmin
    Sim, Sangmin
    Choi, Jun Won
    Kum, Dongsuk
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (02) : 1801 - 1813
  • [29] Aerial Monocular 3D Object Detection
    Hu, Yue
    Fang, Shaoheng
    Xie, Weidi
    Chen, Siheng
    [J]. IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, 8 (04) : 1959 - 1966
  • [30] Disentangling Monocular 3D Object Detection
    Simonelli, Andrea
    Bulo, Samuel Rota
    Porzi, Lorenzo
    Lopez-Antequera, Manuel
    Kontschieder, Peter
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 1991 - 1999