MPCFusion: Multi-scale parallel cross fusion for infrared and visible images via convolution and vision Transformer

Cited by: 7

Authors:

Tang, Haojie [1]

Qian, Yao [1]

Xing, Mengliang [1]

Cao, Yisheng [1]

Liu, Gang [1]

Affiliation:

[1] Shanghai Univ Elect Power, Sch Automat Engn, Shanghai 200090, Peoples R China

Source:

OPTICS AND LASERS IN ENGINEERING | 2024 / Volume 176

Funding:

National Natural Science Foundation of China;

Keywords:

Image fusion; Vision Transformer; Convolution; Multi-scale feature; Infrared; NETWORK;

DOI:

10.1016/j.optlaseng.2024.108094

Chinese Library Classification number:

O43 [Optics];

Discipline classification code:

070207 ; 0803 ;

Abstract:

The image fusion community is thriving with the wave of deep learning, and the most popular fusion methods are usually built upon well-designed network structures. However, most current methods do not fully exploit deeper features and ignore the importance of long-range dependencies. In this paper, a convolution and vision Transformer-based multi-scale parallel cross fusion network for infrared and visible images (MPCFusion) is proposed. To exploit deeper texture details, a feature extraction module based on convolution and vision Transformer is designed. To correlate the shallow features of different modalities, a parallel cross-attention module is proposed, in which a parallel-channel model efficiently preserves the proprietary modal features, followed by a cross-spatial model that ensures information interaction between the different modalities. Moreover, a cross-domain attention module based on convolution and vision Transformer is proposed to capture long-range dependencies between in-depth features and effectively solve the problem of global context loss. Finally, a nest-connection-based decoder is used to implement feature reconstruction. In particular, we design a new texture-guided structural similarity loss function to drive the network to preserve more complete texture details. Extensive experimental results illustrate that MPCFusion shows excellent fusion performance and generalization capabilities. The source code will be released at https://github.com/YQ-097/MPCFusion.


Pages: 13
