MSeg3D: Multi-modal 3D Semantic Segmentation for Autonomous Driving

Cited by: 38
Authors
Li, Jiale [1 ]
Dai, Hang [2 ]
Han, Hao [3 ]
Ding, Yong [3 ]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou, Peoples R China
[2] Univ Glasgow, Sch Comp Sci, Glasgow, Lanark, Scotland
[3] Zhejiang Univ, Sch Micronano Elect, Hangzhou, Peoples R China
Keywords
REPRESENTATION;
DOI
10.1109/CVPR52729.2023.02078
CLC Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
LiDAR and camera are two modalities available for 3D semantic segmentation in autonomous driving. The popular LiDAR-only methods severely suffer from inferior segmentation on small and distant objects due to insufficient laser points, while the robust multi-modal solution is under-explored, where we investigate three crucial inherent difficulties: modality heterogeneity, limited sensor field of view intersection, and multi-modal data augmentation. We propose a multi-modal 3D semantic segmentation model (MSeg3D) with joint intra-modal feature extraction and inter-modal feature fusion to mitigate the modality heterogeneity. The multi-modal fusion in MSeg3D consists of geometry-based feature fusion GF-Phase, cross-modal feature completion, and semantic-based feature fusion SF-Phase on all visible points. The multi-modal data augmentation is reinvigorated by applying asymmetric transformations on the LiDAR point cloud and multi-camera images individually, which benefits the model training with diversified augmentation transformations. MSeg3D achieves state-of-the-art results on nuScenes, Waymo, and SemanticKITTI datasets. Under malfunctioning multi-camera input and multi-frame point cloud input, MSeg3D still shows robustness and improves over the LiDAR-only baseline. Our code is publicly available at https://github.com/jialeli1/lidarseg3d.
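The asymmetric multi-modal augmentation described in the abstract (independent transformations per modality, rather than one shared transformation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the specific transforms (rotation/scale for points, flip/brightness for images) and their parameter ranges are assumptions chosen for clarity.

```python
import numpy as np

def augment_lidar(points, rng):
    """Randomly rotate an (N, 3) point cloud about the z-axis and rescale it.
    These transforms apply only to the LiDAR branch."""
    theta = rng.uniform(-np.pi / 4, np.pi / 4)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    scale = rng.uniform(0.95, 1.05)
    return points @ rot.T * scale

def augment_image(image, rng):
    """Randomly flip an (H, W, 3) image horizontally and jitter its brightness.
    These transforms are drawn independently of the LiDAR transforms."""
    if rng.random() < 0.5:
        image = image[:, ::-1, :]
    brightness = rng.uniform(0.9, 1.1)
    return np.clip(image * brightness, 0.0, 1.0)

# Each modality is augmented with its own independently sampled parameters,
# diversifying the training transformations seen by the fusion model.
rng = np.random.default_rng(0)
points = rng.normal(size=(1024, 3))
image = rng.random(size=(64, 64, 3))
aug_points = augment_lidar(points, rng)
aug_image = augment_image(image, rng)
```

Because the transforms are sampled per modality, the model cannot rely on a fixed geometric alignment between augmented points and pixels at training time; in the paper this asymmetry is what makes the augmentation space richer than a single shared transform.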
Pages: 21694-21704
Page count: 11