SF3D: SlowFast Temporal 3D Object Detection

Cited by: 0

Authors:

Wang, Renhao [1,2]

Yu, Zhiding [2]

Lan, Shiyi [2]

Xie, Enze [2,3]

Chen, Ke [2]

Anandkumar, Anima [2,4]

Alvarez, Jose M. [2]

Affiliations:

[1] Tsinghua University, Beijing, People's Republic of China

[2] NVIDIA, Santa Clara, CA, USA

[3] University of Hong Kong, Hong Kong, People's Republic of China

[4] Caltech, Pasadena, CA 91125, USA

Source:

2024 35th IEEE Intelligent Vehicles Symposium (IEEE IV 2024) | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

Leveraging inputs over multiple consecutive frames has been shown to benefit 3D object detection. However, existing approaches often demonstrate unsatisfactory scaling with increasing temporal histories. In this work, we propose SF3D, a late fusion module which addresses this issue by better modeling temporal relationships via a two-stream factorization. Concretely, SF3D operates on an input sequence of consecutive bird's-eye view (BEV) features, which is partitioned into "short-term" and "long-term" frames. A more heavily parameterized short-term branch using adapters and deformable attention aggregates features closer to the current timestep. In parallel, a long-term branch composed of efficiently implemented global convolution layers aggregates a larger window of temporally distant historical features. This two-stream paradigm allows SF3D to effectively consume near-term information, while scaling to efficiently leverage longer historical windows. We show that SF3D works with arbitrary upstream BEV encoders and downstream detectors, achieving improvements over recent state-of-the-art on the Waymo Open and nuScenes benchmarks.
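The two-stream partition described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `partition_frames`, `mean_aggregate`, and `two_stream_fuse` are hypothetical, each BEV feature map is reduced to a flat list of floats, and a simple element-wise mean stands in for both the short-term branch (adapters and deformable attention in the paper) and the long-term branch (global convolution layers in the paper). Fusing by concatenation is also an assumption.

```python
from typing import List, Sequence, Tuple

# Hypothetical stand-in for one flattened BEV feature map.
Frame = List[float]


def partition_frames(
    frames: Sequence[Frame], num_short: int
) -> Tuple[List[Frame], List[Frame]]:
    """Split a chronologically ordered sequence (oldest first) into
    (long_term, short_term); short_term holds the num_short newest frames."""
    if not 0 < num_short <= len(frames):
        raise ValueError("num_short must be between 1 and len(frames)")
    split = len(frames) - num_short
    return list(frames[:split]), list(frames[split:])


def mean_aggregate(frames: Sequence[Frame]) -> Frame:
    """Placeholder aggregator: element-wise mean over frames.
    Stands in for the learned branches, which are NOT reproduced here."""
    if not frames:
        return []
    count = len(frames)
    return [sum(values) / count for values in zip(*frames)]


def two_stream_fuse(frames: Sequence[Frame], num_short: int) -> Frame:
    """Aggregate each stream separately, then fuse late by concatenation
    (the concatenation is an assumption made for this sketch)."""
    long_term, short_term = partition_frames(frames, num_short)
    short_feature = mean_aggregate(short_term)  # near-term information
    long_feature = mean_aggregate(long_term)    # temporally distant history
    return short_feature + long_feature


if __name__ == "__main__":
    history = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 8.0]]
    print(two_stream_fuse(history, num_short=2))
```

The sketch only shows the data flow: the most recent frames go to one branch, the older frames to the other, and the two results are combined afterwards, which is what makes the module a late fusion step independent of the upstream encoder.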


Pages: 1280-1285

Page count: 6
