Query-based Temporal Fusion with Explicit Motion for 3D Object Detection

Cited by: 0
Authors
Hou, Jinghua [1 ]
Liu, Zhe [1 ]
Liang, Dingkang [1 ]
Zou, Zhikang [2 ]
Ye, Xiaoqing [2 ]
Bai, Xiang [1 ]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Baidu Inc, Beijing, Peoples R China
Keywords
DOI
Not available
CLC number
TP18 [Artificial intelligence theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Effectively utilizing temporal information to improve 3D detection performance is vital for autonomous driving vehicles. Existing methods conduct temporal fusion based on either dense BEV features or sparse 3D proposal features. However, the former does not focus on foreground objects, leading to higher computation cost and sub-optimal performance. The latter requires time-consuming operations to generate sparse 3D proposal features, and its performance is limited by the quality of the 3D proposals. In this paper, we propose a simple and effective Query-based Temporal Fusion Network (QTNet). The main idea is to exploit the object queries of previous frames to enhance the representation of current object queries through the proposed Motion-guided Temporal Modeling (MTM) module, which uses the spatial positions of object queries along the temporal dimension to reliably associate them across adjacent frames. Experimental results show that QTNet outperforms both BEV-based and proposal-based temporal fusion methods on the nuScenes dataset. Moreover, MTM is a plug-and-play module that can be integrated into advanced LiDAR-only or multi-modality 3D detectors, bringing new SOTA performance on nuScenes with negligible computation cost and latency. These experiments demonstrate the effectiveness and generalization of our method. The code is available at https://github.com/AlmoonYsl/QTNet.
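The core idea of the MTM module, as described above, is to associate object queries across adjacent frames using their spatial positions warped by motion. The following is a minimal illustrative sketch of that idea, not the authors' implementation: previous-frame query positions are moved forward by estimated per-object velocities (`prev_vel`, `dt`, and the Gaussian bandwidth `sigma` are assumed quantities), soft matching weights are derived from the distances to current queries, and the motion-aligned previous features are fused into the current ones.

```python
import math

def motion_guided_fusion(cur_feat, cur_pos, prev_feat, prev_pos, prev_vel,
                         dt=0.5, sigma=1.0):
    """Illustrative sketch of motion-guided temporal fusion between object
    queries of adjacent frames (hypothetical, not the QTNet code).

    cur_feat:  list of N current-frame query feature vectors
    cur_pos:   list of N current-frame BEV positions (x, y)
    prev_feat: list of M previous-frame query feature vectors
    prev_pos:  list of M previous-frame BEV positions (x, y)
    prev_vel:  list of M estimated velocities (vx, vy)
    """
    # Warp previous query positions forward by their estimated motion.
    warped = [(px + vx * dt, py + vy * dt)
              for (px, py), (vx, vy) in zip(prev_pos, prev_vel)]
    out = []
    for feat, (cx, cy) in zip(cur_feat, cur_pos):
        # Distance-based soft matching weights (closer -> larger weight).
        w = [math.exp(-((cx - wx) ** 2 + (cy - wy) ** 2) / (2 * sigma ** 2))
             for wx, wy in warped]
        s = sum(w) + 1e-6
        w = [wi / s for wi in w]
        # Enhance the current query with motion-aligned previous features.
        fused = [f + sum(wi * pf[c] for wi, pf in zip(w, prev_feat))
                 for c, f in enumerate(feat)]
        out.append(fused)
    return out
```

In practice the paper's module operates on learned query embeddings inside a transformer decoder; this sketch only conveys the matching-by-warped-position principle stated in the abstract.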
Pages: 16