Adapting Depth Distribution for 3D Object Detection with a Two-Stage Training Paradigm

Cited by: 0
Authors
Luo, Yixin [1 ,2 ]
Huang, Zhangjin [1 ,2 ]
Bao, Zhongkui [3 ]
Affiliations
[1] Univ Sci & Technol China, Hefei 230027, Peoples R China
[2] Deqing Alpha Innovat Inst, Huzhou 313299, Peoples R China
[3] Anhui Univ, Hefei 230601, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
3D Object Detection; Depth Estimation; Two-Stage Training;
DOI
10.1007/978-981-97-5612-4_6
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104; 0812; 0835; 1405;
Abstract
Lift-Splat-Shoot-based 3D object detection systems predict targets' bounding boxes from images by leveraging an explicit depth distribution that ties the depth and detection modules together. Whereas conventional end-to-end models prioritize minimizing the disparity between estimated and ground-truth depth maps, our study underscores the intrinsic value of the depth distribution itself. To exploit this perspective, we introduce a novel two-stage training paradigm that optimizes the depth and detection modules separately, refining the depth distribution specifically for 3D object detection. In the first stage, we train the depth module for precise depth estimation, supplemented by an auxiliary detection module that provides additional supervisory feedback for detection accuracy; this auxiliary component is discarded once it has served its purpose of improving the depth distribution. In the second stage, with the depth module's parameters fixed, we train a fresh detection module from scratch under direct detection supervision. Additionally, a lightweight trainable depth adapter is inserted after the depth module to further adapt and polish the depth distribution, aligning it more closely with the detection objectives. Experiments on the nuScenes dataset show that our approach significantly surpasses baseline models, achieving a notable 1.13% improvement on the NDS metric.
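To make the two-stage schedule concrete, below is a minimal PyTorch-style sketch. DepthNet, DepthAdapter, DetectionHead, and the loss callables are hypothetical placeholders standing in for the paper's actual LSS-based modules; only the training schedule itself (joint stage-1 optimization, discarding the auxiliary head, freezing the depth module, and adding the trainable adapter in stage 2) mirrors what the abstract describes.

```python
# Minimal sketch of the two-stage paradigm described in the abstract.
# All module and loss names (DepthNet, DepthAdapter, DetectionHead,
# depth_loss, det_loss) are hypothetical placeholders, not the authors' code.
import torch
import torch.nn as nn

class DepthNet(nn.Module):
    """Predicts a per-pixel categorical depth distribution (placeholder)."""
    def __init__(self, in_ch=3, depth_bins=64):
        super().__init__()
        self.net = nn.Conv2d(in_ch, depth_bins, kernel_size=1)
    def forward(self, img):
        return self.net(img).softmax(dim=1)  # (B, D, H, W) distribution

class DepthAdapter(nn.Module):
    """Lightweight trainable adapter refining the frozen depth distribution."""
    def __init__(self, depth_bins=64):
        super().__init__()
        self.refine = nn.Conv2d(depth_bins, depth_bins, kernel_size=1)
    def forward(self, depth_dist):
        return (depth_dist + self.refine(depth_dist)).softmax(dim=1)

class DetectionHead(nn.Module):
    """Stand-in for the detection module operating on the distribution."""
    def __init__(self, depth_bins=64, num_outputs=10):
        super().__init__()
        self.head = nn.Conv2d(depth_bins, num_outputs, kernel_size=1)
    def forward(self, depth_dist):
        return self.head(depth_dist)

def train_two_stage(loader, depth_loss, det_loss, epochs=1):
    depth_net = DepthNet()

    # Stage 1: train the depth module with depth supervision, plus an
    # auxiliary detection head whose gradients shape the depth distribution.
    aux_head = DetectionHead()
    opt1 = torch.optim.Adam(list(depth_net.parameters()) +
                            list(aux_head.parameters()))
    for _ in range(epochs):
        for img, gt_depth, gt_boxes in loader:
            dist = depth_net(img)
            loss = depth_loss(dist, gt_depth) + det_loss(aux_head(dist), gt_boxes)
            opt1.zero_grad(); loss.backward(); opt1.step()
    del aux_head  # the auxiliary head is discarded after stage 1

    # Stage 2: freeze the depth module; train a fresh detection head and a
    # lightweight adapter that aligns depth with the detection objective.
    for p in depth_net.parameters():
        p.requires_grad_(False)
    adapter, det_head = DepthAdapter(), DetectionHead()
    opt2 = torch.optim.Adam(list(adapter.parameters()) +
                            list(det_head.parameters()))
    for _ in range(epochs):
        for img, _, gt_boxes in loader:
            with torch.no_grad():
                dist = depth_net(img)  # frozen depth distribution
            loss = det_loss(det_head(adapter(dist)), gt_boxes)
            opt2.zero_grad(); loss.backward(); opt2.step()
    return depth_net, adapter, det_head
```

The design choice the sketch highlights is that stage 2 never backpropagates into the depth module: only the small adapter absorbs the detection gradients, so the refined depth distribution from stage 1 is preserved rather than overwritten.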
Pages: 62-73
Page count: 12