Learning Temporal Cues by Predicting Objects Move for Multi-camera 3D Object Detection

Cited by: 0
Authors
Moon, Seokha [1]
Park, Hongbeen [1]
Lee, Jaekoo [2]
Kim, Jinkyu [1]
Affiliations
[1] Korea Univ, Dept Comp Sci & Engn, Seoul, South Korea
[2] Kookmin Univ, Coll Comp Sci, Seoul, South Korea
Funding
National Research Foundation of Singapore
DOI
10.1109/ICRA57147.2024.10610934
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In autonomous driving and robotics, there is growing interest in using short-term historical data to enhance multi-camera 3D object detection, leveraging the continuous and correlated nature of input video streams. Recent work has focused on spatially aligning BEV-based features across timesteps. However, this approach is often limited because its gain does not scale well with long-term past observations. To address this, we advocate supervising a model to predict objects' poses given past observations, thus explicitly guiding it to learn objects' temporal cues. To this end, we propose a model called DAP (Detection After Prediction), consisting of a two-branch network: (i) a branch responsible for forecasting the current objects' poses given past observations and (ii) another branch that detects objects based on the current and past observations. The features predicting the current objects from branch (i) are fused into branch (ii) to transfer predictive knowledge. We conduct extensive experiments on the large-scale nuScenes dataset, and we observe that utilizing such predictive information significantly improves the overall detection performance. Our model can be used plug-and-play, showing consistent performance gains.
Pages: 6607-6613
Page count: 7