Query-based Temporal Fusion with Explicit Motion for 3D Object Detection

Cited by: 0
Authors
Hou, Jinghua [1]
Liu, Zhe [1]
Liang, Dingkang [1]
Zou, Zhikang [2]
Ye, Xiaoqing [2]
Bai, Xiang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Baidu Inc, Beijing, Peoples R China
Source
ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023) | 2023
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Using temporal information effectively to improve 3D detection performance is vital for autonomous vehicles. Existing methods perform temporal fusion on either dense BEV features or sparse 3D proposal features. The former does not focus on foreground objects, which increases computation cost and leads to sub-optimal performance. The latter relies on time-consuming operations to generate sparse 3D proposal features, and its performance is limited by the quality of the 3D proposals. In this paper, we propose a simple and effective Query-based Temporal Fusion Network (QTNet). The main idea is to use the object queries from previous frames to enhance the representation of the current object queries through the proposed Motion-guided Temporal Modeling (MTM) module, which uses the spatial position of object queries along the temporal dimension to reliably construct their relevance between adjacent frames. Experimental results show that QTNet outperforms BEV-based and proposal-based methods on the nuScenes dataset. In addition, MTM is a plug-and-play module that can be integrated into advanced LiDAR-only or multi-modality 3D detectors, and it achieves new SOTA performance on the nuScenes dataset with negligible computation cost and latency. These experiments demonstrate the superiority and generalization of our method. The code is available at https://github.com/AlmoonYsl/QTNet.
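The idea of relating object queries across adjacent frames by their spatial positions can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a simplified, hypothetical illustration in plain Python. The function name `motion_guided_fusion`, the use of per-query velocities, the time gap `dt`, the softmax temperature `tau`, and the residual sum are all assumptions made for the example, and ego-motion compensation is assumed to have been applied already.

```python
import math


def motion_guided_fusion(cur_centers, cur_feats,
                         prev_centers, prev_vels, prev_feats,
                         dt=0.5, tau=1.0):
    """Fuse previous-frame query features into current-frame queries.

    Hypothetical sketch: centers and velocities are (x, y) tuples in a
    shared coordinate frame; features are equal-length lists of floats.
    dt is the assumed time gap between frames, tau an assumed temperature.
    """
    # Nothing to fuse: return copies of the current features unchanged.
    if not prev_centers:
        return [list(f) for f in cur_feats]

    # Explicit motion: shift each previous query to its expected position
    # in the current frame using its velocity.
    moved = [(x + vx * dt, y + vy * dt)
             for (x, y), (vx, vy) in zip(prev_centers, prev_vels)]

    fused = []
    for (cx, cy), feat in zip(cur_centers, cur_feats):
        # Relevance from spatial proximity: nearer previous queries score higher.
        scores = [-math.hypot(cx - px, cy - py) / tau for px, py in moved]
        top = max(scores)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]

        # Weighted sum of previous features, added residually to the current one.
        agg = [sum(w * f[d] for w, f in zip(weights, prev_feats))
               for d in range(len(feat))]
        fused.append([a + b for a, b in zip(feat, agg)])
    return fused


if __name__ == "__main__":
    out = motion_guided_fusion(
        cur_centers=[(1.0, 0.0)],
        cur_feats=[[1.0, 0.0]],
        prev_centers=[(0.0, 0.0), (50.0, 50.0)],
        prev_vels=[(2.0, 0.0), (0.0, 0.0)],
        prev_feats=[[0.0, 1.0], [5.0, 5.0]],
    )
    print(out)
```

In this toy call, the first previous query moves onto the current query after the velocity shift, so it receives almost all of the weight, while the distant second query contributes almost nothing. Because the weights depend only on query positions, the cost grows with the number of queries rather than with the size of a dense BEV grid, which is consistent with the low-overhead claim in the abstract.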
Pages: 16