Multi-Modal Multi-Task (3MT) Road Segmentation

Cited by: 3
Authors
Milli, Erkan [1]
Erkent, Ozgur [2]
Yilmaz, Asim Egemen [1]
Affiliations
[1] Ankara Univ, Dept Elect & Elect Engn, TR-0600 Ankara, Turkiye
[2] Hacettepe Univ, Comp Sci Dept, TR-0600 Ankara, Turkiye
Keywords
multi-task learning; road segmentation; sensor fusion;
DOI
10.1109/LRA.2023.3295254
Chinese Library Classification
TP24 [Robotics]
Discipline codes
080202; 1405
Abstract
Multi-modal systems can produce more reliable road-detection results than single-modality systems because they perceive different aspects of the scene. We focus on using raw sensor inputs instead of architectures that require costly pre-processing, such as surface normals or dense depth predictions, as is typical in many state-of-the-art works. By using raw sensor inputs, we aim for a low-cost model that minimizes both pre-processing and model computation costs. This study presents a cost-effective and highly accurate solution for road segmentation that integrates data from multiple sensors within a multi-task learning architecture. A fusion architecture is proposed in which RGB and LiDAR depth images constitute the inputs of the network. Another contribution of this study is the use of an IMU/GNSS (inertial measurement unit / global navigation satellite system) inertial navigation system, whose data are collected synchronously and calibrated with the LiDAR-camera pair, to compute aggregated dense LiDAR depth images. Experiments on the KITTI dataset demonstrate that the proposed method offers fast and high-performance solutions. We also show the performance of our method on Cityscapes, where raw LiDAR data are not available. The segmentation results obtained for both full- and half-resolution images are competitive with existing methods. Therefore, we conclude that our method does not depend solely on raw LiDAR data; rather, it can be used with different sensor modalities. The inference times obtained in all experiments are very promising for real-time applications.
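The fusion idea described in the abstract can be illustrated with a small sketch: separate encoders for the raw RGB image and the dense LiDAR depth image, whose features are fused before a shared segmentation head. All layer sizes, names, and the concatenation-based fusion below are illustrative assumptions; they do not reproduce the authors' 3MT architecture or its multi-task heads.

```python
# Minimal, illustrative sketch of RGB + LiDAR-depth fusion for road
# segmentation (PyTorch). Assumed architecture, not the paper's 3MT model.
import torch
import torch.nn as nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with BatchNorm and ReLU, spatial size preserved."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )


class RGBDepthFusionSeg(nn.Module):
    def __init__(self, num_classes=2):
        super().__init__()
        # Separate encoders for the two raw modalities.
        self.rgb_enc = conv_block(3, 32)     # RGB camera image
        self.depth_enc = conv_block(1, 32)   # dense LiDAR depth image
        self.pool = nn.MaxPool2d(2)
        # Shared trunk after feature-level fusion by channel concatenation.
        self.fused = conv_block(64, 64)
        # Per-pixel road / not-road logits, upsampled back to input resolution.
        self.head = nn.Sequential(
            nn.Conv2d(64, num_classes, 1),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
        )

    def forward(self, rgb, depth):
        f_rgb = self.pool(self.rgb_enc(rgb))
        f_depth = self.pool(self.depth_enc(depth))
        f = torch.cat([f_rgb, f_depth], dim=1)  # fuse the two modalities
        return self.head(self.fused(f))


if __name__ == "__main__":
    net = RGBDepthFusionSeg()
    rgb = torch.randn(1, 3, 128, 256)    # camera image
    depth = torch.randn(1, 1, 128, 256)  # projected LiDAR depth map
    print(net(rgb, depth).shape)          # torch.Size([1, 2, 128, 256])
```

In the paper's setting, the depth input would come from LiDAR points aggregated over time using the IMU/GNSS poses and projected into the camera frame; here a random tensor stands in for that input.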
Pages: 5408-5415
Number of pages: 8
Related Papers
50 records in total
  • [1] MultiMAE: Multi-modal Multi-task Masked Autoencoders
    Bachmann, Roman
    Mizrahi, David
    Atanov, Andrei
    Zamir, Amir
    [J]. COMPUTER VISION, ECCV 2022, PT XXXVII, 2022, 13697 : 348 - 367
  • [2] Multi-task Learning of Semantic Segmentation and Height Estimation for Multi-modal Remote Sensing Images
    Wang, Mengyu
    Yan, Zhiyuan
    Feng, Yingchao
    Diao, Wenhui
    Sun, Xian
    [J]. Journal of Geodesy and Geoinformation Science, 2023, 6 (04) : 27 - 39
  • [3] MULTI-MODAL MULTI-TASK LEARNING FOR SEMANTIC SEGMENTATION OF LAND COVER UNDER CLOUDY CONDITIONS
    Xu, Fang
    Shi, Yilei
    Yang, Wen
    Zhu, Xiaoxiang
    [J]. IGARSS 2023 - 2023 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM, 2023, : 6274 - 6277
  • [4] Multi-modal microblog classification via multi-task learning
    Zhao, Sicheng
    Yao, Hongxun
    Zhao, Sendong
    Jiang, Xuesong
    Jiang, Xiaolei
    [J]. Multimedia Tools and Applications, 2016, 75 : 8921 - 8938
  • [5] MultiNet: Multi-Modal Multi-Task Learning for Autonomous Driving
    Chowdhuri, Sauhaarda
    Pankaj, Tushar
    Zipser, Karl
    [J]. 2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 1496 - 1504
  • [6] A Multi-modal Multi-task based Approach for Movie Recommendation
    Raj, Subham
    Mondal, Prabir
    Chakder, Daipayan
    Saha, Sriparna
    Onoe, Naoyuki
    [J]. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023,
  • [7] Multi-Modal Multi-Task Learning for Automatic Dietary Assessment
    Liu, Qi
    Zhang, Yue
    Liu, Zhenguang
    Yuan, Ye
    Cheng, Li
    Zimmermann, Roger
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 2347 - 2354
  • [8] Multi-modal multi-task feature fusion for RGBT tracking
    Cai, Yujue
    Sui, Xiubao
    Gu, Guohua
    [J]. INFORMATION FUSION, 2023, 97
  • [9] Multi-modal microblog classification via multi-task learning
    Zhao, Sicheng
    Yao, Hongxun
    Zhao, Sendong
    Jiang, Xuesong
    Jiang, Xiaolei
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2016, 75 (15) : 8921 - 8938
  • [10] Multi-task Multi-modal Models for Collective Anomaly Detection
    Ide, Tsuyoshi
    Phan, Dzung T.
    Kalagnanam, Jayant
    [J]. 2017 17TH IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2017, : 177 - 186