Deep multimodal fusion for semantic image segmentation: A survey

Cited by: 100
Authors
Zhang, Yifei [1]
Sidibe, Desire [2]
Morel, Olivier [1]
Meriaudeau, Fabrice [1]
Affiliations
[1] Univ Bourgogne Franche Comte, ImViA, VIBOT ERL CNRS 6000, F-71200 Le Creusot, France
[2] Univ Paris Saclay, Univ Evry, IBISC, F-91020 Evry, France
Keywords
Image fusion; Multi-modal; Deep learning; Semantic segmentation; NEURAL-NETWORKS; RGB-D; POLARIZATION; RECOGNITION; VISION
DOI
10.1016/j.imavis.2020.104042
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recent advances in deep learning have shown excellent performance in various scene understanding tasks. However, in some complex environments or under challenging conditions, it is necessary to employ multiple modalities that provide complementary information on the same scene. A variety of studies have demonstrated that deep multimodal fusion for semantic image segmentation achieves significant performance improvement. These fusion approaches take the benefits of multiple information sources and generate an optimal joint prediction automatically. This paper describes the essential background concepts of deep multimodal fusion and the relevant applications in computer vision. In particular, we provide a systematic survey of multimodal fusion methodologies, multimodal segmentation datasets, and quantitative evaluations on the benchmark datasets. Existing fusion methods are summarized according to a common taxonomy: early fusion, late fusion, and hybrid fusion. Based on their performance, we analyze the strengths and weaknesses of different fusion strategies. Current challenges and design choices are discussed, aiming to provide the reader with a comprehensive and heuristic view of deep multimodal image segmentation. (C) 2020 Elsevier B.V. All rights reserved.
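The abstract's taxonomy can be illustrated with a toy sketch. It is not the survey's method: the per-pixel linear scorers below stand in for deep networks, and every name, weight, and input value is a hypothetical chosen for illustration. Early fusion is shown as concatenating modality features before a single classifier, and late fusion as averaging the class scores of separate per-modality classifiers. Hybrid fusion, which combines both ideas, is not shown.

```python
# Toy illustration only: per-pixel linear scorers stand in for deep models.
# All names, weights, and inputs are hypothetical, not taken from the survey.

def linear_scores(features, weights):
    """Return one score per class for a single pixel's feature vector."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def argmax(scores):
    """Index of the highest score, i.e. the predicted class label."""
    return max(range(len(scores)), key=lambda i: scores[i])

def early_fusion(rgb, depth, joint_weights):
    """Concatenate modality features at the input, then classify once."""
    return linear_scores(rgb + depth, joint_weights)

def late_fusion(rgb, depth, rgb_weights, depth_weights):
    """Classify each modality separately, then average the class scores."""
    s_rgb = linear_scores(rgb, rgb_weights)
    s_depth = linear_scores(depth, depth_weights)
    return [(a + b) / 2 for a, b in zip(s_rgb, s_depth)]

# One pixel with 3 colour channels and 1 depth channel; 2 classes.
rgb_pixel = [0.2, 0.4, 0.1]
depth_pixel = [0.9]

rgb_w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]      # one row per class
depth_w = [[0.5], [-0.5]]
joint_w = [[1.0, 0.0, 0.0, 0.5], [0.0, 1.0, 0.0, -0.5]]

early_scores = early_fusion(rgb_pixel, depth_pixel, joint_w)
late_scores = late_fusion(rgb_pixel, depth_pixel, rgb_w, depth_w)
early_label = argmax(early_scores)
late_label = argmax(late_scores)
```

In a real segmentation model the same two functions would run over every pixel (or feature-map location), and the fusion step in the late variant could be a learned combination rather than a plain average.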
Pages: 17