MSL-CCRN: Multi-stage self-supervised learning based cross-modality contrastive representation network for infrared and visible image fusion
Cited by: 0
Authors:
Yan, Zhilin [1]
Nie, Rencan [1]
Cao, Jinde [2,3]
Xie, Guangxu [1]
Ding, Zhengze [1]
Affiliations:
[1] Yunnan Univ, Sch Informat Sci & Engn, Kunming 650500, Peoples R China
[2] Southeast Univ, Sch Math, Nanjing 211189, Peoples R China
[3] Ahlia Univ, Manama 10878, Bahrain
Funding:
China Postdoctoral Science Foundation;
Keywords:
Contrastive representation network;
Image fusion;
Multi-stage;
Contrastive learning;
Self-supervised;
MULTISCALE TRANSFORM;
FRAMEWORK;
WAVELET;
NEST;
DOI:
10.1016/j.dsp.2024.104853
Chinese Library Classification:
TM [Electrical Engineering];
TN [Electronics and Communication Technology];
Discipline Classification Codes:
0808;
0809;
Abstract:
Infrared and visible image fusion (IVIF) must handle the differing information carried by the two modalities, so the focus of research is how to better extract this complementary information. In this work, we propose a multi-stage self-supervised learning based cross-modality contrastive representation network for infrared and visible image fusion (MSL-CCRN). Firstly, considering that scene differences between modalities affect the fusion of cross-modal images, we propose a contrastive representation network (CRN). CRN enhances the interaction between the fused image and the source images, and significantly improves the similarity between the meaningful features of each modality and the fused image. Secondly, because IVIF lacks ground truth, the quality of a directly obtained fused image is seriously degraded. We design a multi-stage fusion strategy to address the loss of important information in this process. Notably, our method is a self-supervised network. In fusion stage I, we reconstruct the initial fused image as a new view for fusion stage II. In fusion stage II, we use the fused image obtained in the previous stage to carry out three-view contrastive representation, thereby constraining the feature extraction of the source images. This allows the final fused image to incorporate more of the important information in the source images. Extensive qualitative and quantitative experiments, together with downstream object detection experiments, show that our proposed method performs excellently compared with most state-of-the-art methods.
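The abstract describes aligning fused-image features with features from each source modality via contrastive representation. The sketch below illustrates one plausible form of such a cross-modality contrastive objective, an InfoNCE-style loss over pooled feature embeddings; the function name, tensor shapes, and temperature are illustrative assumptions and do not reproduce the paper's actual CRN loss.

```python
import torch
import torch.nn.functional as F

def contrastive_alignment_loss(fused_feat, ir_feat, vis_feat, temperature=0.1):
    """Hypothetical InfoNCE-style loss pulling fused features toward both
    source modalities. Inputs are (batch, dim) embeddings, e.g. pooled
    encoder outputs; this is an illustrative sketch, not the paper's loss."""
    # L2-normalize so dot products become cosine similarities
    f = F.normalize(fused_feat, dim=1)
    i = F.normalize(ir_feat, dim=1)
    v = F.normalize(vis_feat, dim=1)

    def info_nce(anchor, positive):
        # Positive pairs sit on the diagonal of the similarity matrix;
        # the other samples in the batch act as negatives.
        logits = anchor @ positive.t() / temperature
        labels = torch.arange(anchor.size(0), device=anchor.device)
        return F.cross_entropy(logits, labels)

    # Encourage the fused representation to stay close to both modalities
    return info_nce(f, i) + info_nce(f, v)

# Example usage with random embeddings (batch of 8, 128-dim features)
if __name__ == "__main__":
    fused, ir, vis = (torch.randn(8, 128) for _ in range(3))
    print(contrastive_alignment_loss(fused, ir, vis).item())
```

In a multi-stage setup as described, the stage-I fused image could supply a third view, with an additional term aligning stage-II features against it; that extension is omitted here for brevity.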
Pages: 14