Multi-modal dataset and fusion network for simultaneous semantic segmentation of on-road dynamic objects

Cited: 0
Authors
Cho, Jieun [1 ]
Ha, Jinsu [1 ]
Song, Hamin [1 ]
Jang, Sungmoon [2 ]
Jo, Kichun [3 ]
Affiliations
[1] Konkuk Univ, Dept Smart Vehicle Engn, Seoul 05029, South Korea
[2] Hyundai Motor Co, Automot R&D Div, Seoul 06182, South Korea
[3] Hanyang Univ, Dept Automot Engn, Seoul 04763, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Semantic segmentation; Sensor fusion; Perception; Deep learning; Autonomous driving;
DOI
10.1016/j.engappai.2025.110024
Chinese Library Classification
TP [Automation and computer technology];
Subject Classification Code
0812;
Abstract
An accurate and robust perception system is essential for autonomous vehicles to interact with the various dynamic objects on the road. By applying semantic segmentation to data from a camera and a light detection and ranging (LiDAR) sensor, dynamic objects can be classified at the pixel and point levels, respectively. However, single-sensor approaches face challenges, especially under adverse lighting conditions or with sparse point densities. To address these challenges, this paper proposes a sensor-fusion network for simultaneous point cloud and image semantic segmentation. The proposed network adopts a modality-specific architecture to fully exploit the characteristics of each sensor's data and achieves geometrically accurate matching through an image, point, and voxel feature fusion module. Additionally, we introduce a dataset that provides semantic labels for synchronized images and point clouds. Experimental results show that the proposed fusion approach outperforms uni-modal methods and remains robust in challenging real-world scenarios. The dataset is publicly available at https://github.com/ailab-konkuk/MultiModal-Dataset.
Pages: 11
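To make the abstract's fusion idea concrete, the sketch below shows one minimal way such an image/point/voxel feature fusion step could look in PyTorch. It is an assumption-based illustration, not the authors' implementation: the class name PointImageVoxelFusion, the feature dimensions, and the gather-and-concatenate strategy are hypothetical, and it presumes that per-point pixel coordinates (from camera calibration) and per-point voxel indices have already been computed upstream.

    # Hypothetical sketch of an image/point/voxel feature fusion module;
    # not the paper's actual network, just one plausible realization.
    import torch
    import torch.nn as nn

    class PointImageVoxelFusion(nn.Module):
        def __init__(self, point_dim=64, image_dim=64, voxel_dim=32, out_dim=128):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(point_dim + image_dim + voxel_dim, out_dim),
                nn.ReLU(inplace=True),
                nn.Linear(out_dim, out_dim),
            )

        def forward(self, point_feats, image_feats, pixel_uv, voxel_feats, voxel_idx):
            # point_feats: (N, Cp) per-point features from a point branch
            # image_feats: (Ci, H, W) feature map from an image branch
            # pixel_uv:    (N, 2) integer (u, v) pixel of each projected point
            # voxel_feats: (V, Cv) per-voxel features from a voxel branch
            # voxel_idx:   (N,) voxel index of each point
            u, v = pixel_uv[:, 0], pixel_uv[:, 1]
            img_per_point = image_feats[:, v, u].t()  # image feature at each projection -> (N, Ci)
            vox_per_point = voxel_feats[voxel_idx]    # voxel feature shared by its points -> (N, Cv)
            fused = torch.cat([point_feats, img_per_point, vox_per_point], dim=1)
            return self.mlp(fused)                    # (N, out_dim) fused per-point features

    # Toy usage: 1000 points projected into a 128x256 feature map, 50 occupied voxels.
    fusion = PointImageVoxelFusion()
    out = fusion(
        torch.randn(1000, 64),                        # point features
        torch.randn(64, 128, 256),                    # image feature map
        torch.stack([torch.randint(0, 256, (1000,)), # u in [0, W)
                     torch.randint(0, 128, (1000,))], dim=1),  # v in [0, H)
        torch.randn(50, 32),                          # voxel features
        torch.randint(0, 50, (1000,)),                # voxel index per point
    )
    print(out.shape)  # torch.Size([1000, 128])

The design choice sketched here (gathering the image feature at each point's projected pixel before concatenation) is what gives a fusion module geometrically aligned features per point; the paper's actual module may differ in structure and detail.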