Monocular 3D Object Detection for Autonomous Driving Based on Contextual Transformer

Cited by: 0

Authors:

She, Xiangyang [1]

Yan, Weijia [1]

Dong, Lihong [1]

Affiliations:

[1] College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, 710054, China

Source:

Computer Engineering and Applications | 2024, Vol. 60, No. 19

DOI:

10.3778/j.issn.1002-8331.2307-0084

Abstract:

To address missed detections and poor multi-scale target detection in monocular 3D object detection, this paper proposes CM-RTM3D, a monocular 3D object detection algorithm for autonomous driving based on the Contextual Transformer. First, the Contextual Transformer (CoT) is introduced into the ResNet-50 network to construct a ResNet-Transformer architecture for feature extraction. Second, a multi-scale spatial perception (MSP) module is designed to mitigate the loss of shallow features: it applies scale-space response operations, embeds the coordinate attention (CA) mechanism along both the horizontal and vertical spatial directions, and uses the softmax function to generate soft importance weights for each scale. Finally, the Huber loss function replaces the L1 loss function in the offset loss. Experimental results on the KITTI autonomous driving dataset show that, compared with the RTM3D algorithm, the proposed algorithm improves AP3D by 4.84, 3.82, and 5.36 percentage points and APBEV by 4.75, 6.26, and 3.56 percentage points at the easy, moderate, and hard difficulty levels, respectively. © 2024 Journal of Computer Engineering and Applications Beijing Co., Ltd.; Science Press. All rights reserved.
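The replacement of L1 loss with Huber loss in the offset term can be illustrated with a minimal sketch. The abstract gives no formula or threshold, so the standard Huber definition, the `delta=1.0` default, the function names, and the sample offsets below are all assumptions for illustration, not the authors' implementation.

```python
def l1_loss(pred, target):
    """Mean absolute error over paired offsets."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def huber_loss(pred, target, delta=1.0):
    """Mean Huber loss: quadratic for |error| <= delta, linear beyond it.

    delta is an assumed threshold; the abstract does not state one.
    """
    total = 0.0
    for p, t in zip(pred, target):
        err = abs(p - t)
        if err <= delta:
            total += 0.5 * err * err
        else:
            total += delta * (err - 0.5 * delta)
    return total / len(pred)


# Hypothetical predicted and ground-truth offsets (illustrative values only).
pred_offsets = [0.2, -0.1, 3.0]
true_offsets = [0.0, 0.0, 0.0]

print("L1 loss:   ", l1_loss(pred_offsets, true_offsets))
print("Huber loss:", huber_loss(pred_offsets, true_offsets, delta=1.0))
```

Under this definition, small errors are penalized quadratically, which gives a smooth gradient near zero, while large errors grow only linearly, as they do under L1.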


Pages: 178-189
