Query-based Temporal Fusion with Explicit Motion for 3D Object Detection

Cited by: 0

Authors:

Hou, Jinghua [1]

Liu, Zhe [1]

Liang, Dingkang [1]

Zou, Zhikang [2]

Ye, Xiaoqing [2]

Bai, Xiang [1]

Affiliations:

[1] Huazhong University of Science and Technology, Wuhan, Hubei, People's Republic of China

[2] Baidu Inc., Beijing, People's Republic of China

Source:

Advances in Neural Information Processing Systems 36 (NeurIPS 2023) | 2023

Keywords:

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Effectively utilizing temporal information to improve 3D detection performance is vital for autonomous driving. Existing methods conduct temporal fusion on either dense BEV features or sparse 3D proposal features. However, the former do not focus on foreground objects, which leads to higher computation cost and sub-optimal performance. The latter rely on time-consuming operations to generate sparse 3D proposal features, and their performance is limited by the quality of the 3D proposals. In this paper, we propose a simple and effective Query-based Temporal Fusion Network (QTNet). The main idea is to exploit the object queries from previous frames to enhance the representation of the current object queries through the proposed Motion-guided Temporal Modeling (MTM) module, which uses the spatial position information of object queries along the temporal dimension to reliably construct their relevance between adjacent frames. Experimental results show that the proposed QTNet outperforms BEV-based and proposal-based methods on the nuScenes dataset. In addition, MTM is a plug-and-play module that can be integrated into some advanced LiDAR-only or multi-modality 3D detectors, and it even achieves new SOTA performance on the nuScenes dataset with negligible computation cost and latency. These experiments demonstrate the superiority and generalization ability of our method. The code is available at https://github.com/AlmoonYsl/QTNet.
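
The abstract describes the idea of relating object queries across adjacent frames through their spatial positions. The following is a minimal illustrative sketch of that general idea, not the authors' implementation: it assumes each previous-frame query carries a 2D center, a velocity, and a feature vector, shifts the previous centers to the current timestamp, and aggregates nearby previous features into each current query. All names (`align_previous_centers`, `motion_guided_fusion`) and parameters (`dt`, `radius`, `temperature`) are hypothetical, and the distance-based weighting stands in for whatever attention mechanism the actual MTM module uses.

```python
import math


def align_previous_centers(prev_centers, prev_velocities, dt):
    """Shift previous-frame query centers to the current timestamp (explicit motion)."""
    return [(x + vx * dt, y + vy * dt)
            for (x, y), (vx, vy) in zip(prev_centers, prev_velocities)]


def motion_guided_fusion(cur_centers, cur_feats, prev_centers, prev_velocities,
                         prev_feats, dt=0.5, radius=2.0, temperature=1.0):
    """Enhance current query features with nearby motion-compensated previous queries.

    Assumption: relevance is modeled as exp(-distance / temperature) inside a
    hard radius; the real module may use a learned attention instead.
    """
    aligned = align_previous_centers(prev_centers, prev_velocities, dt)
    fused = []
    for (cx, cy), feat in zip(cur_centers, cur_feats):
        # Distance from this current query to every aligned previous query.
        dists = [math.hypot(cx - px, cy - py) for px, py in aligned]
        # Keep only previous queries within the radius (hard relevance mask).
        kept = [(d, f) for d, f in zip(dists, prev_feats) if d <= radius]
        if not kept:
            # No relevant history: leave the current feature unchanged.
            fused.append(list(feat))
            continue
        weights = [math.exp(-d / temperature) for d, _ in kept]
        total = sum(weights)
        # Weighted average of the kept previous features, per channel.
        agg = [sum(w * f[i] for w, (_, f) in zip(weights, kept)) / total
               for i in range(len(feat))]
        # Residual update of the current query feature.
        fused.append([a + b for a, b in zip(feat, agg)])
    return fused


if __name__ == "__main__":
    out = motion_guided_fusion(
        cur_centers=[(1.0, 0.0), (50.0, 50.0)],
        cur_feats=[[1.0, 0.0], [0.0, 1.0]],
        prev_centers=[(0.0, 0.0)],
        prev_velocities=[(2.0, 0.0)],
        prev_feats=[[0.5, 0.5]],
        dt=0.5,
    )
    print(out)
```

In this toy setup the single previous query moves at 2 m/s for 0.5 s and lands exactly on the first current query, so only that query receives history; the distant second query is left untouched. This shows why compensating for motion before matching helps associate queries belonging to the same object.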


Pages: 16
