Towards efficient multi-modal 3D object detection: Homogeneous sparse fuse network

被引:0
|
作者
Tang, Yingjuan [1 ]
He, Hongwen [1 ]
Wang, Yong [1 ]
Wu, Jingda [2 ]
机构
[1] Beijing Inst Technol, Sch Mech Engn, Beijing 100081, Peoples R China
[2] Nanyang Technol Univ, Sch Mech & Aerosp Engn, 50 Nanyang Ave, Singapore 639798, Singapore
关键词
Autonomous driving; 3D object detection; Multi-modal; Sparse convolutional networks; Point cloud and image fusion; Homogeneous fusion;
D O I
10.1016/j.eswa.2024.124945
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
LiDAR-only 3D detection methods struggle with the sparsity of point clouds. To overcome this issue, multi- modal methods have been proposed, but their fusion is a challenge due to the heterogeneous representation of images and point clouds. This paper proposes a novel multi-modal framework, Homogeneous Sparse Fusion (HS-Fusion), which generates pseudo point clouds from depth completion. The proposed framework introduces a 3D foreground-aware middle extractor that efficiently extracts high-responding foreground features from sparse point cloud data. This module can be integrated into existing sparse convolutional neural networks. Furthermore, the proposed homogeneous attentive fusion enables cross-modality consistency fusion. Finally, the proposed HS-Fusion can simultaneously combine 2D image features and 3D geometric features of pseudo point clouds using multi-representation feature extraction. The proposed network has been found to attain better performance on the 3D object detection benchmarks. In particular, the proposed model demonstrates a 4.02% improvement in accuracy compared to the pure model. Moreover, its inference speed surpasses that of other models, thus further validating the efficacy of HS-Fusion.
引用
收藏
页数:12
相关论文
共 50 条
  • [41] A Multi-Modal Fusion-Based 3D Multi-Object Tracking Framework With Joint Detection
    Wang, Xiyang
    Fu, Chunyun
    He, Jiawei
    Huang, Mingguang
    Meng, Ting
    Zhang, Siyu
    Zhou, Hangning
    Xu, Ziyao
    Zhang, Chi
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2025, 10 (01): : 532 - 539
  • [42] Multi-modal Data Analysis and Fusion for Robust Object Detection in 2D/3D Sensing
    Schierl, Jonathan
    Graehling, Quinn
    Aspiras, Theus
    Asari, Vijay
    Van Rynbach, Andre
    Rabb, Dave
    2020 IEEE APPLIED IMAGERY PATTERN RECOGNITION WORKSHOP (AIPR): TRUSTED COMPUTING, PRIVACY, AND SECURING MULTIMEDIA, 2020,
  • [43] Cross Diffusion on Multi-hypergraph for Multi-modal 3D Object Recognition
    Zhang, Zizhao
    Lin, Haojie
    Zhu, Junjie
    Zhao, Xibin
    Gao, Yue
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING, PT I, 2018, 11164 : 38 - 49
  • [44] Probabilistic 3D Multi-Modal, Multi-Object Tracking for Autonomous Driving
    Chiu, Hsu-kuang
    Lie, Jie
    Ambrus, Rares
    Bohg, Jeannette
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 14227 - 14233
  • [45] Exploiting Multi-Modal Synergies for Enhancing 3D Multi-Object Tracking
    Xu, Xinglong
    Ren, Weihong
    Chen, Xi'ai
    Fan, Huijie
    Han, Zhi
    Liu, Honghai
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2024, 9 (10): : 8643 - 8650
  • [46] GraphAlign: Enhancing Accurate Feature Alignment by Graph matching for Multi-Modal 3D Object Detection
    Song, Ziying
    Wei, Haiyue
    Bai, Lin
    Yang, Lei
    Jia, Caiyan
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION, ICCV, 2023, : 3335 - 3346
  • [47] Generating Adversarial Point Clouds on Multi-modal Fusion Based 3D Object Detection Model
    Wang, Huiying
    Shen, Huixin
    Zhang, Boyang
    Wen, Yu
    Meng, Dan
    INFORMATION AND COMMUNICATIONS SECURITY (ICICS 2021), PT I, 2021, 12918 : 187 - 203
  • [48] Enhancing Multi-modal Features Using Local Self-attention for 3D Object Detection
    Li, Hao
    Zhang, Zehan
    Zhao, Xian
    Wang, Yulong
    Shen, Yuxi
    Pu, Shiliang
    Mao, Hui
    COMPUTER VISION, ECCV 2022, PT X, 2022, 13670 : 532 - 549
  • [49] PPF-Det: Point-Pixel Fusion for Multi-Modal 3D Object Detection
    Xie, Guotao
    Chen, Zhiyuan
    Gao, Ming
    Hu, Manjiang
    Qin, Xiaohui
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2024, 25 (06) : 5598 - 5611
  • [50] Sparse Fuse Dense: Towards High Quality 3D Detection with Depth Completion
    Wu, Xiaopei
    Peng, Liang
    Yang, Honghui
    Xie, Liang
    Huang, Chenxi
    Deng, Chengqi
    Liu, Haifeng
    Cai, Deng
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 5408 - 5417