MPCFusion: Multi-scale parallel cross fusion for infrared and visible images via convolution and vision Transformer

Cited by: 7
Authors
Tang, Haojie [1]
Qian, Yao [1]
Xing, Mengliang [1]
Cao, Yisheng [1]
Liu, Gang [1]
Affiliations
[1] Shanghai Univ Elect Power, Sch Automat Engn, Shanghai 200090, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image fusion; Vision Transformer; Convolution; Multi-scale feature; Infrared; NETWORK
DOI
10.1016/j.optlaseng.2024.108094
Chinese Library Classification number
O43 [Optics]
Discipline classification code
070207; 0803
Abstract
The image fusion community is thriving on the wave of deep learning, and the most popular fusion methods are usually built upon well-designed network structures. However, most current methods do not fully exploit deeper features and ignore the importance of long-range dependencies. In this paper, a convolution and vision Transformer-based multi-scale parallel cross fusion network for infrared and visible images (MPCFusion) is proposed. To exploit deeper texture details, a feature extraction module based on convolution and vision Transformer is designed. To correlate the shallow features of different modalities, a parallel cross-attention module is proposed, in which a parallel-channel model efficiently preserves the modality-specific features, followed by a cross-spatial model that ensures information interaction between the modalities. Moreover, a cross-domain attention module based on convolution and vision Transformer is proposed to capture long-range dependencies between in-depth features, which effectively solves the problem of global context loss. Finally, a nest-connection-based decoder is used for feature reconstruction. In particular, we design a new texture-guided structural similarity loss function to drive the network to preserve more complete texture details. Extensive experimental results illustrate that MPCFusion shows excellent fusion performance and generalization capabilities. The source code will be released at https://github.com/YQ-097/MPCFusion.
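The abstract does not specify how the cross-modal interaction is computed, so the following is only a generic illustration of cross-attention between two modalities, not the MPCFusion module itself. It is a minimal sketch assuming scaled dot-product attention with identity projections (no learned weights); the function names and the toy feature values are hypothetical.

```python
import math

def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def cross_attention(queries, keys_values):
    """Each query token attends over the tokens of the other modality.

    Assumption: identity query/key/value projections, purely for illustration.
    """
    d = len(queries[0])
    scores = matmul(queries, transpose(keys_values))
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, keys_values), weights

# Toy feature tokens (hypothetical values): 3 tokens x 2 channels per modality.
infrared = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
visible = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]

# Infrared tokens query visible features, and vice versa.
ir_from_vis, w_ir = cross_attention(infrared, visible)
vis_from_ir, w_vis = cross_attention(visible, infrared)
```

Each output token is a weighted average of the other modality's tokens, which is the basic mechanism by which information can flow between infrared and visible features; a trained network would add learned projections and combine this with the convolutional branches described in the abstract.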
Pages: 13