TransFusion: Multi-Modal Robust Fusion for 3D Object Detection in Foggy Weather Based on Spatial Vision Transformer

Cited by: 0
Authors
Zhang, Cheng [1]
Wang, Hai [1]
Cai, Yingfeng [2]
Chen, Long [2]
Li, Yicheng [2]
Affiliations
[1] Jiangsu Univ, Sch Automot & Traff Engn, Zhenjiang 212013, Peoples R China
[2] Jiangsu Univ, Automot Engn Res Inst, Zhenjiang 212013, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D object detection; multi-modal data fusion; intelligent vehicle; attention mechanism; spatial vision transformer; temporal-spatial memory fusion; radar; LiDAR;
DOI
10.1109/TITS.2024.3420432
Chinese Library Classification (CLC) number
TU [Building Science]
Discipline classification code
0813
Abstract
A practical approach to comprehensive perception of the surrounding environment is multi-modal fusion across several types of vehicular sensors. In clear weather, the camera and LiDAR provide high-resolution images and point clouds that can be used for 3D object detection. In foggy weather, however, fog in the air affects the propagation of light, so both images and point clouds become distorted to varying degrees, which makes accurate detection in adverse weather challenging. Compared to cameras and LiDAR, radar possesses strong penetrating power and is not affected by fog. Therefore, this paper proposes a novel two-stage detection framework called "TransFusion", which leverages LiDAR and radar fusion to solve the problem of environment perception in foggy weather. The proposed framework is composed of a Multi-modal Rotate Region Proposal Network (MM-RRPN) and a Multi-modal Refine Network (MM-RFN). Specifically, a Spatial Vision Transformer (SVT) and a Cross-Modal Attention Mechanism (CMAM) are introduced in the MM-RRPN to improve the robustness of the algorithm in foggy weather. Furthermore, a Temporal-Spatial Memory Fusion (TSMF) module in the MM-RFN is employed to fuse spatial-temporal prior information. In addition, a Multi-branches Combination Loss function (MC-Loss) is designed to efficiently supervise the learning of the network. Extensive experiments were conducted on the Oxford Radar RobotCar (ORR) dataset. The experimental results show that the proposed algorithm performs well in both foggy and clear weather. In foggy weather in particular, the proposed TransFusion achieves 85.31 mAP, outperforming all other competing approaches. The demo is available at: https://youtu.be/ugjIYHLgn98.
Pages: 10652-10666
Page count: 15