Dual-branch deep cross-modal interaction network for semantic segmentation with thermal images

Cited by: 0
Authors
Dai K. [1]
Chen S. [1]
Affiliations
[1] School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal feature; Deep interaction; Semantic segmentation; Thermal images;
DOI
10.1016/j.engappai.2024.108820
Abstract
Semantic segmentation using RGB (Red-Green-Blue) images and thermal data is an indispensable component of autonomous driving. The key to RGB-Thermal (RGB and Thermal) semantic segmentation is achieving the interaction and fusion of features between RGB and thermal images. Therefore, we propose a dual-branch deep cross-modal interaction network (DCIT) based on an Encoder–Decoder structure. This framework consists of two parallel networks for feature extraction from RGB and Thermal data. Specifically, in each feature extraction stage of the Encoder, we design a Cross Feature Regulation Module (CFRM) to align and correct modality-specific features by reducing inter-modality feature differences and eliminating intra-modality noise. Then, the modality features are aggregated through a Cross Modal Feature Fusion Module (CMFFM) based on cross linear attention to capture global information from the modality features. Finally, an Adaptive Multi-Scale Cross-positional Fusion Module (AMCFM) utilizes the fused features to integrate consistent semantic information in the Decoder stage. Our framework improves the interaction of cross-modal features. Extensive experiments on urban scene datasets demonstrate that our proposed framework outperforms other RGB-Thermal semantic segmentation methods in terms of objective metrics and subjective visual assessments. © 2024 Elsevier Ltd
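The abstract outlines a two-branch Encoder with per-stage cross-modal interaction (CFRM), per-stage fusion (CMFFM), and a multi-scale Decoder (AMCFM). The following is a minimal PyTorch sketch of that overall flow only; the module internals here (sigmoid gating, concat-and-conv fusion, upsample-and-sum decoding), the 4-stage backbone, channel widths, and class count are simplified assumptions for illustration and do not reproduce the paper's actual designs such as cross linear attention.

```python
# Minimal sketch of the dual-branch RGB-Thermal pipeline described in the abstract.
# CFRM / CMFFM / AMCFM below are simplified stand-ins, not the authors' modules.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CFRM(nn.Module):
    """Stand-in Cross Feature Regulation Module: each modality is corrected by a
    sigmoid gate computed from the other modality (placeholder for the paper's design)."""
    def __init__(self, c):
        super().__init__()
        self.gate_r = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())
        self.gate_t = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, f_r, f_t):
        f_r = f_r + self.gate_t(f_t) * f_t   # regulate RGB features with thermal cues
        f_t = f_t + self.gate_r(f_r) * f_r   # regulate thermal features with RGB cues
        return f_r, f_t


class CMFFM(nn.Module):
    """Stand-in Cross Modal Feature Fusion Module: simple concat + 1x1 conv in place
    of the paper's cross linear attention."""
    def __init__(self, c):
        super().__init__()
        self.fuse = nn.Conv2d(2 * c, c, 1)

    def forward(self, f_r, f_t):
        return self.fuse(torch.cat([f_r, f_t], dim=1))


class AMCFM(nn.Module):
    """Stand-in decoder: projects, upsamples, and sums the per-stage fused features,
    then predicts per-pixel class scores."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.proj = nn.ModuleList(nn.Conv2d(c, 64, 1) for c in channels)
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, feats, out_size):
        x = sum(F.interpolate(p(f), size=out_size, mode="bilinear", align_corners=False)
                for p, f in zip(self.proj, feats))
        return self.head(x)


class DCIT(nn.Module):
    """Two parallel 4-stage encoders with per-stage CFRM interaction and CMFFM fusion,
    followed by the AMCFM decoder (backbone and widths are assumptions)."""
    def __init__(self, num_classes=9, channels=(32, 64, 128, 256)):
        super().__init__()
        def stage(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                                 nn.BatchNorm2d(cout), nn.ReLU(inplace=True))
        self.rgb_stages = nn.ModuleList([stage(3, channels[0])] +
            [stage(channels[i], channels[i + 1]) for i in range(3)])
        self.thermal_stages = nn.ModuleList([stage(1, channels[0])] +
            [stage(channels[i], channels[i + 1]) for i in range(3)])
        self.cfrms = nn.ModuleList(CFRM(c) for c in channels)
        self.cmffms = nn.ModuleList(CMFFM(c) for c in channels)
        self.decoder = AMCFM(channels, num_classes)

    def forward(self, rgb, thermal):
        fused = []
        f_r, f_t = rgb, thermal
        for enc_r, enc_t, cfrm, cmffm in zip(self.rgb_stages, self.thermal_stages,
                                             self.cfrms, self.cmffms):
            f_r, f_t = enc_r(f_r), enc_t(f_t)
            f_r, f_t = cfrm(f_r, f_t)        # align/correct modality-specific features
            fused.append(cmffm(f_r, f_t))    # aggregate cross-modal features per stage
        return self.decoder(fused, rgb.shape[-2:])


if __name__ == "__main__":
    model = DCIT()
    rgb, thermal = torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640)
    print(model(rgb, thermal).shape)  # -> torch.Size([1, 9, 480, 640])
```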
Related papers
50 records in total
  • [31] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [32] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Zaipeng Duan
    Xiao Huang
    Jie Ma
    Neural Processing Letters, 2023, 55 : 6361 - 6375
  • [33] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66
  • [34] Lightweight dual-branch network for vehicle exhausts segmentation
    Sheng, Chiyun
    Hu, Bin
    Meng, Fanjun
    Yin, Dong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (12) : 17785 - 17806
  • [35] A Dual-Branch Fusion Network for Surgical Instrument Segmentation
    Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou, Henan 450001, China
    IEEE Trans. Med. Rob. Bion., 4 : 1542 - 1554
  • [36] Dual-Branch Network for Cloud and Cloud Shadow Segmentation
    Lu, Chen
    Xia, Min
    Qian, Ming
    Chen, Binyu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [37] Dual-branch residual network for lung nodule segmentation
    Cao, Haichao
    Liu, Hong
    Song, Enmin
    Hung, Chih-Cheng
    Ma, Guangzhi
    Xu, Xiangyang
    Jin, Renchao
    Lu, Jianguo
    APPLIED SOFT COMPUTING, 2020, 86
  • [39] DANet: Dual-Branch Activation Network for Small Object Instance Segmentation of Ship Images
    Sun, Yuxin
    Su, Li
    Yuan, Shouzheng
    Meng, Hao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6708 - 6720
  • [40] Dual-branch image projection network for geographic atrophy segmentation in retinal OCT images
    Xiaoming Liu
    Jieyang Li
    Ying Zhang
    Junping Yao
    Scientific Reports, 15 (1)