Fully Sparse Fusion for 3D Object Detection

Cited by: 1
Authors
Li Y. [1]
Fan L. [1]
Liu Y. [1]
Huang Z. [2]
Chen Y. [3]
Wang N. [2]
Zhang Z. [1]
Affiliations
[1] Center for Research on Intelligent Perception and Computing (CRIPAC), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA), Beijing
[2] TuSimple, Beijing
[3] Centre for Artificial Intelligence and Robotics, Hong Kong Institute of Science and Innovation, Chinese Academy of Sciences (HKISI CAS), Hong Kong
Keywords
3D object detection; autonomous driving; Cameras; Detectors; Feature extraction; fully sparse architecture; Instance segmentation; Laser radar; long-range perception; multi-sensor fusion; Point cloud compression; Three-dimensional displays
DOI
10.1109/TPAMI.2024.3392303
Abstract
Currently prevalent multi-modal 3D detection methods rely on dense detectors that usually use dense Bird's-Eye-View (BEV) feature maps. However, the cost of such BEV feature maps grows quadratically with the detection range, which makes them difficult to scale to long-range detection. Recently, LiDAR-only fully sparse architectures have been gaining attention for their high efficiency in long-range perception. In this paper, we study how to develop a multi-modal fully sparse detector. Specifically, our proposed detector integrates the well-studied 2D instance segmentation into the LiDAR side, in parallel with the 3D instance segmentation part of the LiDAR-only baseline. The proposed instance-based fusion framework maintains full sparsity while overcoming the constraints associated with the LiDAR-only fully sparse detector. Our framework showcases state-of-the-art performance on the widely used nuScenes dataset, the Waymo Open Dataset, and the long-range Argoverse 2 dataset. Notably, the inference speed of our proposed method under the long-range perception setting is 2.7× faster than that of other state-of-the-art multi-modal 3D detection methods. Code is released at https://github.com/BraveGroup/FullySparseFusion.
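The quadratic-cost claim in the abstract can be illustrated with a small calculation. This is only a sketch: the function name `bev_cell_count`, the 0.5 m cell size, and the three ranges are illustrative assumptions, not values taken from the paper. It counts the cells of a square dense BEV grid that spans the detection range in both directions.

```python
def bev_cell_count(detection_range_m: float, cell_size_m: float) -> int:
    """Cells in a square BEV grid covering [-range, +range] on both axes."""
    cells_per_side = round(2 * detection_range_m / cell_size_m)
    return cells_per_side * cells_per_side


# Hypothetical cell size, chosen only for illustration.
CELL_SIZE_M = 0.5

for range_m in (50, 100, 200):
    cells = bev_cell_count(range_m, CELL_SIZE_M)
    print(f"range {range_m:>3} m -> {cells:,} BEV cells")
```

Under these assumptions, doubling the range quadruples the number of cells, so a 200 m grid holds 16 times as many cells as a 50 m grid. A fully sparse detector avoids allocating this dense grid, which is the motivation the abstract gives for long-range settings.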
Pages: 1 / 15
Page count: 14