Query-based Temporal Fusion with Explicit Motion for 3D Object Detection

Times cited: 0
Authors
Hou, Jinghua [1]
Liu, Zhe [1]
Liang, Dingkang [1]
Zou, Zhikang [2]
Ye, Xiaoqing [2]
Bai, Xiang [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan, Hubei, Peoples R China
[2] Baidu Inc, Beijing, Peoples R China
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Effectively utilizing temporal information to improve 3D detection performance is vital for autonomous driving vehicles. Existing methods conduct temporal fusion based on either dense BEV features or sparse 3D proposal features. However, the former does not focus on foreground objects, which leads to higher computation cost and sub-optimal performance. The latter relies on time-consuming operations to generate sparse 3D proposal features, and its performance is limited by the quality of the 3D proposals. In this paper, we propose a simple and effective Query-based Temporal Fusion Network (QTNet). The main idea is to exploit the object queries from previous frames to enhance the representation of the current object queries through the proposed Motion-guided Temporal Modeling (MTM) module, which uses the spatial position information of object queries along the temporal dimension to reliably construct their relevance between adjacent frames. Experimental results show that QTNet outperforms BEV-based and proposal-based methods on the nuScenes dataset. In addition, MTM is a plug-and-play module that can be integrated into advanced LiDAR-only or multi-modality 3D detectors, where it even brings new SOTA performance on the nuScenes dataset with negligible computation cost and latency. These experiments demonstrate the superiority and generalization ability of our method. The code is available at https://github.com/AlmoonYsl/QTNet.
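The idea of relating object queries across adjacent frames by their spatial positions can be illustrated with a small NumPy sketch. This is not the authors' MTM implementation (see the repository linked in the abstract for that); it is a simplified illustration under several assumptions that the abstract does not state: each previous-frame query carries an estimated 2D center and velocity, both frames are already expressed in the same coordinate system, and relevance is a softmax over negative center distance. The function name `motion_guided_fusion` and the temperature `tau` are hypothetical.

```python
import numpy as np


def motion_guided_fusion(cur_feat, cur_xy, prev_feat, prev_xy, prev_vel, dt, tau=1.0):
    """Illustrative query-level temporal fusion (hypothetical, not the paper's code).

    cur_feat:  (N, C) features of current-frame object queries
    cur_xy:    (N, 2) centers of current-frame queries
    prev_feat: (M, C) features of previous-frame object queries
    prev_xy:   (M, 2) centers of previous-frame queries
    prev_vel:  (M, 2) estimated velocities of previous-frame queries (assumed input)
    dt:        time gap between the two frames, in seconds
    tau:       softmax temperature (assumed hyperparameter)
    """
    # Shift previous centers to their expected positions in the current frame.
    pred_xy = prev_xy + prev_vel * dt                                      # (M, 2)
    # Pairwise distance between current centers and motion-compensated centers.
    dist = np.linalg.norm(cur_xy[:, None, :] - pred_xy[None, :, :], axis=-1)  # (N, M)
    # Closer pairs get larger weights: numerically stable softmax over -dist.
    logits = -dist / tau
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Aggregate previous-frame features and add them to the current ones.
    fused = cur_feat + weights @ prev_feat                                 # (N, C)
    return fused, weights


# Toy example: two objects moving at 2 m/s along x, frames 0.5 s apart.
cur_feat = np.ones((2, 4))
cur_xy = np.array([[1.0, 0.0], [10.0, 10.0]])
prev_feat = np.array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]])
prev_xy = np.array([[0.0, 0.0], [9.0, 10.0]])
prev_vel = np.array([[2.0, 0.0], [2.0, 0.0]])

fused, weights = motion_guided_fusion(cur_feat, cur_xy, prev_feat, prev_xy, prev_vel, dt=0.5)
```

In the toy example each current query is matched most strongly to the previous query whose motion-compensated center coincides with it, so the fusion operates only on a handful of object-level features rather than on a dense BEV map.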
Pages: 16