Multi-Modal Streaming 3D Object Detection

Cited by: 0
Authors
Abdelfattah, Mazen [1]
Yuan, Kaiwen [1, 2]
Wang, Z. Jane [1]
Ward, Rabab [1]
Affiliations
[1] Univ British Columbia, Dept Elect & Comp Engn, Vancouver, BC V6T 1Z4, Canada
[2] Safari AI Inc, Claymont, DE 19703 USA
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
RGB-D perception; sensor fusion; streaming 3D object detection; calibration
DOI
10.1109/LRA.2023.3303696
Chinese Library Classification
TP24 [Robotics]
Subject classification codes
080202; 1405
Abstract
Modern autonomous vehicles rely heavily on mechanical LiDARs for perception. Current perception methods generally require $360^\circ$ point clouds, collected sequentially as the LiDAR scans the azimuth and acquires consecutive wedge-shaped slices. The acquisition latency of a full scan ($\sim 100$ ms) may lead to outdated perception, which is detrimental to safe operation. Recent streaming perception works proposed processing LiDAR slices directly and compensating for the narrow field of view (FoV) of a slice by reusing features from preceding slices. These works, however, are all based on a single modality and require past information, which may be outdated. Meanwhile, images from high-frequency cameras can support streaming models as they provide a larger FoV compared to a LiDAR slice. However, this difference in FoV complicates sensor fusion. We propose an innovative camera-LiDAR streaming 3D object detection framework that uses camera images instead of past LiDAR slices to provide an up-to-date, dense, and wide context for streaming perception. The proposed method outperforms prior streaming models and powerful full-scan baselines on the challenging NuScenes benchmark in detection accuracy and end-to-end runtime. Our method is shown to be robust to missing camera images, narrow LiDAR slices, and small camera-LiDAR miscalibration.
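The wedge-shaped slices mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `split_into_slices` and `slice_acquisition_ms`, the equal-angle partition, and the slice count are assumptions made for illustration. Only the roughly 100 ms full-scan period comes from the abstract.

```python
import math


def split_into_slices(points, num_slices):
    """Group (x, y, z) points into equal-angle azimuth wedges.

    Hypothetical helper: assumes the sensor sits at the origin and that
    slices are equal-width wedges starting at azimuth 0.
    """
    slices = [[] for _ in range(num_slices)]
    width = 2.0 * math.pi / num_slices
    for x, y, z in points:
        # Map the azimuth from (-pi, pi] into [0, 2*pi).
        azimuth = math.atan2(y, x) % (2.0 * math.pi)
        # Guard against a rounding edge case where azimuth equals 2*pi.
        index = min(int(azimuth / width), num_slices - 1)
        slices[index].append((x, y, z))
    return slices


def slice_acquisition_ms(scan_period_ms, num_slices):
    """Time to acquire one slice, assuming a constant rotation rate."""
    return scan_period_ms / num_slices


# Four points, one per quadrant, split into four 90-degree wedges.
points = [(1.0, 1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]
slices = split_into_slices(points, num_slices=4)
per_slice_ms = slice_acquisition_ms(100.0, num_slices=4)
```

Under these assumptions, a detector that consumes one slice at a time waits for a fraction of the scan period rather than the full rotation, which is the latency argument the abstract makes for streaming perception.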
Pages: 6163-6170
Page count: 8