Multi-modal dataset and fusion network for simultaneous semantic segmentation of on-road dynamic objects

Cited: 0
Authors
Cho, Jieun [1 ]
Ha, Jinsu [1 ]
Song, Hamin [1 ]
Jang, Sungmoon [2 ]
Jo, Kichun [3 ]
Affiliations
[1] Konkuk Univ, Dept Smart Vehicle Engn, Seoul 05029, South Korea
[2] Hyundai Motor Co, Automot R&D Div, Seoul 06182, South Korea
[3] Hanyang Univ, Dept Automot Engn, Seoul 04763, South Korea
Funding
National Research Foundation of Singapore
Keywords
Semantic segmentation; Sensor fusion; Perception; Deep learning; Autonomous driving;
DOI
10.1016/j.engappai.2025.110024
Chinese Library Classification
TP [Automation technology; computer technology]
Discipline Code
0812
Abstract
An accurate and robust perception system is essential for autonomous vehicles to interact with the various dynamic objects on the road. By applying semantic segmentation to data from camera and light detection and ranging (LiDAR) sensors, dynamic objects can be classified at the pixel and point levels, respectively. However, single-sensor approaches face challenges, especially under adverse lighting conditions or with sparse point densities. To address these challenges, this paper proposes a sensor-fusion network for simultaneous point cloud and image semantic segmentation. The proposed network adopts a modality-specific architecture to fully exploit the characteristics of each sensor's data, and achieves geometrically accurate matching through an image, point, and voxel feature fusion module. Additionally, we introduce a dataset that provides semantic labels for synchronized images and point clouds. Experimental results show that the proposed fusion approach outperforms uni-modal methods and demonstrates robust performance even in challenging real-world scenarios. The dataset is publicly available at https://github.com/ailab-konkuk/MultiModal-Dataset.
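The record gives no implementation detail of the fusion module. Purely as a hedged sketch of the general camera–LiDAR fusion idea the abstract describes (geometrically matching point features with pixel-aligned image features), the following NumPy snippet projects LiDAR points through an assumed pinhole intrinsic matrix and concatenates nearest-neighbour image features onto each point feature. All names, shapes, and the projection setup are illustrative assumptions, not the authors' actual module.

```python
import numpy as np

def fuse_point_image_features(points_xyz, point_feats, image_feats, K):
    """Toy camera-LiDAR feature fusion (illustrative, not the paper's module).

    points_xyz:  (N, 3) points already expressed in the camera frame.
    point_feats: (N, C_pt) per-point features.
    image_feats: (H, W, C_img) dense image feature map.
    K:           (3, 3) pinhole intrinsic matrix.
    Returns (N, C_pt + C_img): point features concatenated with the image
    feature sampled at each point's projected pixel (nearest neighbour).
    Points behind the camera or outside the image get a zero image feature.
    """
    n = points_xyz.shape[0]
    h, w, c_img = image_feats.shape
    sampled = np.zeros((n, c_img), dtype=image_feats.dtype)

    # Keep only points in front of the camera, then project to pixels.
    in_front = points_xyz[:, 2] > 0.0
    uvw = (K @ points_xyz[in_front].T).T          # homogeneous pixel coords
    uv = np.round(uvw[:, :2] / uvw[:, 2:3]).astype(int)

    # Discard projections that fall outside the image bounds.
    in_img = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    idx = np.flatnonzero(in_front)[in_img]
    sampled[idx] = image_feats[uv[in_img, 1], uv[in_img, 0]]  # row = v, col = u

    return np.concatenate([point_feats, sampled], axis=1)
```

In a real pipeline the extrinsic transform from the LiDAR to the camera frame would be applied first, and sampling would typically use bilinear interpolation on a learned feature map rather than nearest-neighbour lookup.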
Pages: 11
Related Papers
50 records
  • [1] EISNet: A Multi-Modal Fusion Network for Semantic Segmentation With Events and Images
    Xie, Bochen
    Deng, Yongjian
    Shao, Zhanpeng
    Li, Youfu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 8639 - 8650
  • [2] Ticino: A multi-modal remote sensing dataset for semantic segmentation
    Barbato, Mirko Paolo
    Piccoli, Flavio
    Napoletano, Paolo
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [3] Application of Multi-modal Fusion Attention Mechanism in Semantic Segmentation
    Liu, Yunlong
    Yoshie, Osamu
    Watanabe, Hiroshi
    COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 378 - 397
  • [4] DFAMNet: dual fusion attention multi-modal network for semantic segmentation on LiDAR point clouds
    Li, Mingjie
    Wang, Gaihua
    Zhu, Minghao
    Li, Chunzheng
    Liu, Hong
    Pan, Xuran
    Long, Qian
    APPLIED INTELLIGENCE, 2024, 54 (04) : 3169 - 3180
  • [5] Multi-modal semantic image segmentation
    Pemasiri, Akila
    Nguyen, Kien
    Sridharan, Sridha
    Fookes, Clinton
    COMPUTER VISION AND IMAGE UNDERSTANDING, 2021, 202
  • [6] TAG-fusion: Two-stage attention guided multi-modal fusion network for semantic segmentation
    Zhang, Zhizhou
    Wang, Wenwu
    Zhu, Lei
    Tang, Zhibin
    DIGITAL SIGNAL PROCESSING, 2025, 156
  • [7] Flexible Fusion Network for Multi-Modal Brain Tumor Segmentation
    Yang, Hengyi
    Zhou, Tao
    Zhou, Yi
    Zhang, Yizhe
    Fu, Huazhu
    IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, 27 (07) : 3349 - 3359
  • [8] MFMamba: A Mamba-Based Multi-Modal Fusion Network for Semantic Segmentation of Remote Sensing Images
    Wang, Yan
    Cao, Li
    Deng, He
    SENSORS, 2024, 24 (22)
  • [9] A Multi-Modal System for Road Detection and Segmentation
    Hu, Xiao
    Rodriguez F, Sergio A.
    Gepperth, Alexander
    2014 IEEE INTELLIGENT VEHICLES SYMPOSIUM PROCEEDINGS, 2014, : 1365 - 1370