SF3D: SlowFast Temporal 3D Object Detection

Cited by: 0

Authors:

Wang, Renhao [1,2]

Yu, Zhiding [2]

Lan, Shiyi [2]

Xie, Enze [2,3]

Chen, Ke [2]

Anandkumar, Anima [2,4]

Alvarez, Jose M. [2]

Affiliations:

[1] Tsinghua University, Beijing, People's Republic of China

[2] NVIDIA, Santa Clara, CA, USA

[3] University of Hong Kong, Hong Kong, People's Republic of China

[4] California Institute of Technology (Caltech), Pasadena, CA 91125, USA

Source:

2024 35th IEEE Intelligent Vehicles Symposium (IEEE IV 2024) | 2024

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

Leveraging inputs over multiple consecutive frames has been shown to benefit 3D object detection. However, existing approaches often demonstrate unsatisfactory scaling with increasing temporal histories. In this work, we propose SF3D, a late fusion module which addresses this issue by better modeling temporal relationships via a two-stream factorization. Concretely, SF3D operates on an input sequence of consecutive bird's-eye view (BEV) features, which is partitioned into "short-term" and "long-term" frames. A more heavily parameterized short-term branch using adapters and deformable attention aggregates features closer to the current timestep. In parallel, a long-term branch composed of efficiently implemented global convolution layers aggregates a larger window of temporally distant historical features. This two-stream paradigm allows SF3D to effectively consume near-term information, while scaling to efficiently leverage longer historical windows. We show that SF3D works with arbitrary upstream BEV encoders and downstream detectors, achieving improvements over recent state-of-the-art on the Waymo Open and nuScenes benchmarks.
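
The two-stream idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: each BEV feature map is flattened to a plain list of floats, scaled dot-product attention stands in for the short-term branch (the paper uses adapters and deformable attention), a single weighted sum over the whole long-term window stands in for the global convolution layers, and the additive fusion is an assumption. All function names and the `num_short` and `kernel` parameters are hypothetical.

```python
import math


def split_history(frames, num_short):
    """Split frames (oldest first, current frame last) into (long_term, short_term)."""
    if not 0 < num_short <= len(frames):
        raise ValueError("num_short must be between 1 and len(frames)")
    return frames[:-num_short], frames[-num_short:]


def short_term_branch(short_frames):
    """Toy stand-in: attend over the short-term frames with the current frame as query."""
    query = short_frames[-1]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, frame)) / scale
              for frame in short_frames]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [sum(w * frame[i] for w, frame in zip(weights, short_frames))
            for i in range(len(query))]


def long_term_branch(long_frames, kernel):
    """Toy stand-in: a temporal kernel spanning the whole long-term window.

    Evaluated at a single output step, such a global convolution reduces to
    one weighted sum over all long-term frames.
    """
    if len(kernel) != len(long_frames):
        raise ValueError("kernel must have one weight per long-term frame")
    dim = len(long_frames[0])
    return [sum(k * frame[i] for k, frame in zip(kernel, long_frames))
            for i in range(dim)]


def fuse(frames, num_short, kernel):
    """Run both branches and add their outputs (additive fusion is an assumption)."""
    long_frames, short_frames = split_history(frames, num_short)
    short_out = short_term_branch(short_frames)
    if not long_frames:
        return short_out
    long_out = long_term_branch(long_frames, kernel)
    return [s + l for s, l in zip(short_out, long_out)]


if __name__ == "__main__":
    history = [[1.0, 0.0], [0.0, 1.0], [1.0, 2.0], [1.0, 2.0]]
    print(fuse(history, num_short=2, kernel=[0.5, 0.5]))  # prints [1.5, 2.5]
```

In this sketch the short-term branch does more work per frame, while the long-term branch spends only one weight per frame, which loosely mirrors the abstract's contrast between a heavily parameterized near-term stream and an efficient stream over a longer window.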

Pages: 1280-1285

Page count: 6
