Dual-branch deep cross-modal interaction network for semantic segmentation with thermal images

Cited by: 0
Authors
Dai K. [1 ]
Chen S. [1 ]
Affiliation
[1] School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing
Funding
National Natural Science Foundation of China
Keywords
Cross-modal feature; Deep interaction; Semantic segmentation; Thermal images
DOI
10.1016/j.engappai.2024.108820
Abstract
Semantic segmentation using RGB (Red-Green-Blue) images and thermal data is an indispensable component of autonomous driving. The key to RGB-Thermal semantic segmentation is achieving the interaction and fusion of features between RGB and thermal images. Therefore, we propose a dual-branch deep cross-modal interaction network (DCIT) based on an Encoder-Decoder structure. This framework consists of two parallel networks for feature extraction from RGB and thermal data. Specifically, in each feature extraction stage of the Encoder, we design a Cross Feature Regulation Module (CFRM) to align and correct modality-specific features by reducing inter-modality feature differences and eliminating intra-modality noise. Then, the modality features are aggregated through a Cross-Modal Feature Fusion Module (CMFFM) based on cross linear attention to capture global information from the modality features. Finally, an Adaptive Multi-Scale Cross-positional Fusion Module (AMCFM) uses the fused features to integrate consistent semantic information in the Decoder stage. Our framework improves the interaction of cross-modal features. Extensive experiments on urban scene datasets demonstrate that our proposed framework outperforms other RGB-Thermal semantic segmentation methods in terms of objective metrics and subjective visual assessments. © 2024 Elsevier Ltd
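The abstract names the three interaction modules (CFRM, CMFFM, AMCFM) without spelling out their internals. As a rough illustration, below is a minimal PyTorch sketch of the two Encoder-side steps under assumed designs: CFRM is approximated by cross-modal channel gating, and CMFFM by a linear-attention block whose queries come from one modality and keys/values from the other. Only the module names come from the abstract; every internal detail (the gating layout, the elu+1 feature map, the residual fusion) is an illustrative assumption, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossFeatureRegulation(nn.Module):
    # Hypothetical stand-in for the paper's CFRM: each modality is
    # re-weighted by channel statistics of the other modality, nudging
    # the two feature maps toward a shared representation.
    def __init__(self, channels: int):
        super().__init__()
        self.rgb_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )
        self.thermal_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid()
        )

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        # Cross-gating: RGB features modulated by thermal statistics, and vice versa.
        rgb_out = rgb + rgb * self.thermal_gate(thermal)
        thermal_out = thermal + thermal * self.rgb_gate(rgb)
        return rgb_out, thermal_out


class CrossLinearAttentionFusion(nn.Module):
    # Hypothetical stand-in for CMFFM: queries from the RGB branch attend to
    # keys/values from the thermal branch via linear (softmax-free) attention,
    # which is O(N) in the number of pixels N = H * W.
    def __init__(self, channels: int):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, rgb: torch.Tensor, thermal: torch.Tensor):
        b, c, h, w = rgb.shape
        q = F.elu(self.q(rgb).flatten(2)) + 1.0      # (B, C, N), positive feature map
        k = F.elu(self.k(thermal).flatten(2)) + 1.0  # (B, C, N)
        v = self.v(thermal).flatten(2)               # (B, C, N)
        kv = torch.einsum("bdn,ben->bde", k, v)      # (B, C, C): aggregate keys/values first
        z = 1.0 / (torch.einsum("bdn,bd->bn", q, k.sum(dim=-1)) + 1e-6)
        out = torch.einsum("bdn,bde->ben", q, kv) * z.unsqueeze(1)
        return rgb + self.proj(out.reshape(b, c, h, w))  # residual fusion


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 32, 32)      # RGB feature map from one encoder stage
    thermal = torch.randn(2, 64, 32, 32)  # thermal feature map, same resolution
    r, t = CrossFeatureRegulation(64)(rgb, thermal)
    fused = CrossLinearAttentionFusion(64)(r, t)
    print(fused.shape)  # torch.Size([2, 64, 32, 32])
```

Aggregating keys and values into a C-by-C matrix before applying the queries is what makes this attention linear in the pixel count, which is presumably why a linear variant is attractive for dense cross-modal fusion at full feature-map resolution.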
Related papers (50 total)
  • [1] DGCBG-Net: A dual-branch network with global cross-modal interaction and boundary guidance for tumor segmentation in PET/CT images
    Zou, Ziwei
    Zou, Beiji
    Kui, Xiaoyan
    Chen, Zhi
    Li, Yang
    COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, 250
  • [2] CalibNet: Dual-Branch Cross-Modal Calibration for RGB-D Salient Instance Segmentation
    Pei, Jialun
    Jiang, Tao
    Tang, He
    Liu, Nian
    Jin, Yueming
    Fan, Deng-Ping
    Heng, Pheng-Ann
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2024, 33 : 4348 - 4362
  • [3] DBDAN: Dual-Branch Dynamic Attention Network for Semantic Segmentation of Remote Sensing Images
    Che, Rui
    Ma, Xiaowen
    Hong, Tingfeng
    Wang, Xinyu
    Feng, Tian
    Zhang, Wei
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT IV, 2024, 14428 : 306 - 317
  • [4] DHRNet: A Dual-Branch Hybrid Reinforcement Network for Semantic Segmentation of Remote Sensing Images
    Bai, Qinyan
    Luo, Xiaobo
    Wang, Yaxu
    Wei, Tengfei
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 4176 - 4193
  • [5] Semantic Guidance Fusion Network for Cross-Modal Semantic Segmentation
    Zhang, Pan
    Chen, Ming
    Gao, Meng
    SENSORS, 2024, 24 (08)
  • [6] Y-Net: Dual-branch Joint Network for Semantic Segmentation
    Chen, Yizhen
    Hu, Haifeng
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2021, 17 (04)
  • [7] PSR-Net: A Dual-Branch Pyramid Semantic Reasoning Network for Segmentation of Remote Sensing Images
    Wang, Lijun
    Li, Bicao
    Wang, Bei
    Li, Chunlei
    Huang, Jie
    Song, Mengxing
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2023, PT II, 2023, 14255 : 582 - 592
  • [8] Dual-branch hybrid network for lesion segmentation in gastric cancer images
    He, Dongzhi
    Zhang, Yuanyu
    Huang, Hui
    Si, Yuhang
    Wang, Zhiqiang
    Li, Yunqi
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [9] Deep semantic hashing with dual attention for cross-modal retrieval
    Wu, Jiagao
    Weng, Weiwei
    Fu, Junxia
    Liu, Linfeng
    Hu, Bin
    NEURAL COMPUTING AND APPLICATIONS, 2022, 34 : 5397 - 5416