Dual-branch deep cross-modal interaction network for semantic segmentation with thermal images

Cited by: 0
Authors
Dai K. [1]
Chen S. [1]
Affiliations
[1] School of Electronics and Information Engineering, Nanjing University of Information Science and Technology, Nanjing
Funding
National Natural Science Foundation of China;
Keywords
Cross-modal feature; Deep interaction; Semantic segmentation; Thermal images;
DOI
10.1016/j.engappai.2024.108820
Abstract
Semantic segmentation using RGB (Red-Green-Blue) images and thermal data is an indispensable component of autonomous driving. The key to RGB-Thermal (RGB and Thermal) semantic segmentation is achieving the interaction and fusion of features between RGB and thermal images. Therefore, we propose a dual-branch deep cross-modal interaction network (DCIT) based on an Encoder–Decoder structure. This framework consists of two parallel networks for feature extraction from RGB and Thermal data. Specifically, in each feature extraction stage of the Encoder, we design a Cross Feature Regulation Module (CFRM) to align and correct modality-specific features by reducing inter-modality feature differences and eliminating intra-modality noise. Then, the modality features are aggregated through a Cross Modal Feature Fusion Module (CMFFM) based on cross linear attention to capture global information from the modality features. Finally, an Adaptive Multi-Scale Cross-positional Fusion Module (AMCFM) utilizes the fused features to integrate consistent semantic information in the Decoder stage. Our framework improves the interaction of cross-modal features. Extensive experiments on urban scene datasets demonstrate that our proposed framework outperforms other RGB-Thermal semantic segmentation methods in terms of objective metrics and subjective visual assessments. © 2024 Elsevier Ltd
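The abstract outlines a two-branch Encoder with per-stage cross-modal interaction (CFRM), per-stage fusion (CMFFM), and a multi-scale Decoder (AMCFM). The following is a minimal PyTorch sketch of that overall flow only; the module internals here (sigmoid gating, concat-and-conv fusion, upsample-and-sum decoding), the 4-stage backbone, channel widths, and class count are simplified assumptions for illustration and do not reproduce the paper's actual designs such as cross linear attention.

```python
# Minimal sketch of the dual-branch RGB-Thermal pipeline described in the abstract.
# CFRM / CMFFM / AMCFM below are simplified stand-ins, not the authors' modules.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CFRM(nn.Module):
    """Stand-in Cross Feature Regulation Module: each modality is corrected by a
    sigmoid gate computed from the other modality (placeholder for the paper's design)."""
    def __init__(self, c):
        super().__init__()
        self.gate_r = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())
        self.gate_t = nn.Sequential(nn.Conv2d(c, c, 1), nn.Sigmoid())

    def forward(self, f_r, f_t):
        f_r = f_r + self.gate_t(f_t) * f_t   # regulate RGB features with thermal cues
        f_t = f_t + self.gate_r(f_r) * f_r   # regulate thermal features with RGB cues
        return f_r, f_t


class CMFFM(nn.Module):
    """Stand-in Cross Modal Feature Fusion Module: simple concat + 1x1 conv in place
    of the paper's cross linear attention."""
    def __init__(self, c):
        super().__init__()
        self.fuse = nn.Conv2d(2 * c, c, 1)

    def forward(self, f_r, f_t):
        return self.fuse(torch.cat([f_r, f_t], dim=1))


class AMCFM(nn.Module):
    """Stand-in decoder: projects, upsamples, and sums the per-stage fused features,
    then predicts per-pixel class scores."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.proj = nn.ModuleList(nn.Conv2d(c, 64, 1) for c in channels)
        self.head = nn.Conv2d(64, num_classes, 1)

    def forward(self, feats, out_size):
        x = sum(F.interpolate(p(f), size=out_size, mode="bilinear", align_corners=False)
                for p, f in zip(self.proj, feats))
        return self.head(x)


class DCIT(nn.Module):
    """Two parallel 4-stage encoders with per-stage CFRM interaction and CMFFM fusion,
    followed by the AMCFM decoder (backbone and widths are assumptions)."""
    def __init__(self, num_classes=9, channels=(32, 64, 128, 256)):
        super().__init__()
        def stage(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1),
                                 nn.BatchNorm2d(cout), nn.ReLU(inplace=True))
        self.rgb_stages = nn.ModuleList([stage(3, channels[0])] +
            [stage(channels[i], channels[i + 1]) for i in range(3)])
        self.thermal_stages = nn.ModuleList([stage(1, channels[0])] +
            [stage(channels[i], channels[i + 1]) for i in range(3)])
        self.cfrms = nn.ModuleList(CFRM(c) for c in channels)
        self.cmffms = nn.ModuleList(CMFFM(c) for c in channels)
        self.decoder = AMCFM(channels, num_classes)

    def forward(self, rgb, thermal):
        fused = []
        f_r, f_t = rgb, thermal
        for enc_r, enc_t, cfrm, cmffm in zip(self.rgb_stages, self.thermal_stages,
                                             self.cfrms, self.cmffms):
            f_r, f_t = enc_r(f_r), enc_t(f_t)
            f_r, f_t = cfrm(f_r, f_t)        # align/correct modality-specific features
            fused.append(cmffm(f_r, f_t))    # aggregate cross-modal features per stage
        return self.decoder(fused, rgb.shape[-2:])


if __name__ == "__main__":
    model = DCIT()
    rgb, thermal = torch.randn(1, 3, 480, 640), torch.randn(1, 1, 480, 640)
    print(model(rgb, thermal).shape)  # -> torch.Size([1, 9, 480, 640])
```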
Related papers
50 records in total
  • [31] Deep Semantic Mapping for Cross-Modal Retrieval
    Wang, Cheng
    Yang, Haojin
    Meinel, Christoph
    2015 IEEE 27TH INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2015), 2015, : 234 - 241
  • [32] Transformer-Based Cross-Modal Information Fusion Network for Semantic Segmentation
    Zaipeng Duan
    Xiao Huang
    Jie Ma
    Neural Processing Letters, 2023, 55 : 6361 - 6375
  • [33] Cross-modal hashing with semantic deep embedding
    Yan, Cheng
    Bai, Xiao
    Wang, Shuai
    Zhou, Jun
    Hancock, Edwin R.
    NEUROCOMPUTING, 2019, 337 : 58 - 66
  • [34] Lightweight dual-branch network for vehicle exhausts segmentation
    Sheng, Chiyun
    Hu, Bin
    Meng, Fanjun
    Yin, Dong
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (12) : 17785 - 17806
  • [35] A Dual-Branch Fusion Network for Surgical Instrument Segmentation
    Zhengzhou University, School of Electrical and Information Engineering, Zhengzhou, Henan 450001, China
    IEEE Trans. Med. Rob. Bion., 4 : 1542 - 1554
  • [36] Dual-Branch Network for Cloud and Cloud Shadow Segmentation
    Lu, Chen
    Xia, Min
    Qian, Ming
    Chen, Binyu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [37] Dual-branch residual network for lung nodule segmentation
    Cao, Haichao
    Liu, Hong
    Song, Enmin
    Hung, Chih-Cheng
    Ma, Guangzhi
    Xu, Xiangyang
    Jin, Renchao
    Lu, Jianguo
    APPLIED SOFT COMPUTING, 2020, 86
  • [39] DANet: Dual-Branch Activation Network for Small Object Instance Segmentation of Ship Images
    Sun, Yuxin
    Su, Li
    Yuan, Shouzheng
    Meng, Hao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6708 - 6720
  • [40] Dual-branch image projection network for geographic atrophy segmentation in retinal OCT images
    Xiaoming Liu
    Jieyang Li
    Ying Zhang
    Junping Yao
    Scientific Reports, 15 (1)