Multi-modal semantic image segmentation

Cited by: 11
Authors
Pemasiri, Akila [1 ]
Nguyen, Kien [1 ]
Sridharan, Sridha [1 ]
Fookes, Clinton [1 ]
Affiliations
[1] Queensland Univ Technol, Image & Video Res Lab, 2 George St,GPO Box 2434, Brisbane, Qld 4001, Australia
Funding
Australian Research Council;
Keywords
Segmentation; X-ray; Mask R-CNN; Neural networks;
DOI
10.1016/j.cviu.2020.103085
CLC number
TP18 [Artificial Intelligence Theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a modality-invariant method to obtain high-quality semantic object segmentation of human body parts for four imaging modalities: visible images, X-ray images, thermal images (heatmaps), and infrared radiation (IR) images. We first consider two modalities (i.e. visible and X-ray images) to develop an architecture suitable for multi-modal semantic segmentation. Due to the intrinsic difference between images from the two modalities, state-of-the-art approaches such as Mask R-CNN do not perform satisfactorily. Insights from analysing how the intermediate layers within Mask R-CNN work on both visible and X-ray modalities have led us to propose a new and efficient network architecture which yields highly accurate semantic segmentation results across both visible and X-ray domains. We design multi-task losses to train the network across different modalities. By conducting multiple experiments across visible and X-ray images of the human upper extremity, we validate the proposed approach, which outperforms the traditional Mask R-CNN method by better exploiting the output features of CNNs. Based on the insights gained on these images from the visible and X-ray domains, we extend the proposed multi-modal semantic segmentation method to two additional modalities (viz. heatmap and IR images). Experiments conducted on these two modalities further confirm our architecture's capacity to improve segmentation by exploiting the complementary information in the different modalities of the images. Our method can also be applied to other modalities and can be effectively utilized for several tasks, including medical image analysis tasks such as image registration and 3D reconstruction across modalities.
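The abstract states that multi-task losses are used to train one network across modalities, but the record does not specify their form. A minimal sketch of one common way to do this, combining per-modality segmentation losses as a weighted sum (function name, modality keys, and weights here are hypothetical, not taken from the paper):

```python
# Hypothetical sketch: combine per-modality segmentation losses into a single
# multi-task training objective via a weighted sum. The loss values below
# stand in for per-modality cross-entropy terms from one training step.

def multi_task_loss(losses, weights=None):
    """Weighted sum of per-modality losses.

    losses  : dict mapping modality name -> scalar loss value
    weights : optional dict of per-modality weights (default 1.0 each)
    """
    weights = weights or {}
    return sum(weights.get(m, 1.0) * v for m, v in losses.items())

# Illustrative per-modality loss values; up-weighting X-ray reflects one
# possible choice when a modality is harder to segment.
per_modality = {"visible": 0.42, "xray": 0.81, "thermal": 0.35, "ir": 0.50}
total = multi_task_loss(per_modality, weights={"xray": 2.0})
```

In practice the per-modality terms would be pixel-wise losses computed on each modality's batches, and the weights become tunable hyperparameters.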
Pages: 11
Related papers
50 records
  • [1] Multi-modal unsupervised domain adaptation for semantic image segmentation
    Hu, Sijie
    Bonardi, Fabien
    Bouchafa, Samia
    Sidibe, Desire
    [J]. PATTERN RECOGNITION, 2023, 137
  • [2] MULTI-MODAL SEMANTIC MESH SEGMENTATION IN URBAN SCENES
    Laupheimer, Dominik
    Haala, Norbert
    [J]. XXIV ISPRS CONGRESS IMAGING TODAY, FORESEEING TOMORROW, COMMISSION II, 2022, 5-2 : 267 - 274
  • [3] MMNet: Multi-modal multi-stage network for RGB-T image semantic segmentation
    Lan, Xin
    Gu, Xiaojing
    Gu, Xingsheng
    [J]. APPLIED INTELLIGENCE, 2022, 52 (05) : 5817 - 5829
  • [5] Application of Multi-modal Fusion Attention Mechanism in Semantic Segmentation
    Liu, Yunlong
    Yoshie, Osamu
    Watanabe, Hiroshi
    [J]. COMPUTER VISION - ACCV 2022, PT VII, 2023, 13847 : 378 - 397
  • [6] Multi-modal Prototypes for Open-World Semantic Segmentation
    Yang, Yuhuan
    Ma, Chaofan
    Ju, Chen
    Zhang, Fei
    Yao, Jiangchao
    Zhang, Ya
    Wang, Yanfeng
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2024,
  • [7] Semantic Segmentation of Defects in Infrastructures through Multi-modal Images
    Shahsavarani, Sara
    Lopez, Fernando
    Ibarra-Castanedo, Clemente
    Maldague, Xavier P. V.
    [J]. THERMOSENSE: THERMAL INFRARED APPLICATIONS XLVI, 2024, 13047
  • [8] Ticino: A multi-modal remote sensing dataset for semantic segmentation
    Barbato, Mirko Paolo
    Piccoli, Flavio
    Napoletano, Paolo
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 249
  • [9] Comprehensive Multi-Modal Interactions for Referring Image Segmentation
    Jain, Kanishk
    Gandhi, Vineet
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3427 - 3435
  • [10] Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency
    Yang, Jie
    Zhu, Ye
    Wang, Chaoqun
    Li, Zhen
    Zhang, Ruimao
    [J]. MEDICAL IMAGING WITH DEEP LEARNING, VOL 227, 2023, 227 : 1602 - 1622