MPCFusion: Multi-scale parallel cross fusion for infrared and visible images via convolution and vision Transformer

Cited by: 7
Authors
Tang, Haojie [1]
Qian, Yao [1]
Xing, Mengliang [1]
Cao, Yisheng [1]
Liu, Gang [1]
Affiliations
[1] Shanghai Univ Elect Power, Sch Automat Engn, Shanghai 200090, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Image fusion; Vision Transformer; Convolution; Multi-scale feature; Infrared; NETWORK;
DOI
10.1016/j.optlaseng.2024.108094
Chinese Library Classification number
O43 [Optics]
Subject classification codes
070207; 0803
Abstract
The image fusion community is thriving with the wave of deep learning, and the most popular fusion methods are usually built upon well-designed network structures. However, most current methods do not fully exploit deeper features and ignore the importance of long-range dependencies. In this paper, a multi-scale parallel cross fusion network for infrared and visible images based on convolution and vision Transformer (MPCFusion) is proposed. To exploit deeper texture details, a feature extraction module based on convolution and vision Transformer is designed. To correlate the shallow features of different modalities, a parallel cross-attention module is proposed, in which a parallel-channel model efficiently preserves the proprietary modal features, followed by a cross-spatial model that ensures information interaction between the modalities. Moreover, a cross-domain attention module based on convolution and vision Transformer is proposed to capture long-range dependencies between in-depth features, effectively solving the problem of global context loss. Finally, a nest-connection-based decoder is used for feature reconstruction. In particular, we design a new texture-guided structural similarity loss function to drive the network to preserve more complete texture details. Extensive experimental results illustrate that MPCFusion shows excellent fusion performance and generalization capabilities. The source code will be released at https://github.com/YQ-097/MPCFusion.
Pages: 13
Related papers
50 in total
  • [21] Fusion of Infrared and Visible Images based on Multi-scale Edge-preserving Decomposition and Sparse Representation
    Rong, Chuanzhen
    Jia, Yongxing
    Yang, Yu
    Zhu, Ying
    Wang, Yuan
    2018 11TH INTERNATIONAL CONGRESS ON IMAGE AND SIGNAL PROCESSING, BIOMEDICAL ENGINEERING AND INFORMATICS (CISP-BMEI 2018), 2018,
  • [22] MMFuse: A multi-scale infrared and visible images fusion algorithm based on morphological reconstruction and membership filtering
    Zhao, Liangjun
    Yang, Hao
    Dong, Linlu
    Zheng, Liping
    Asiya, Manlike
    Zheng, Fengling
    IET IMAGE PROCESSING, 2023, 17 (04) : 1126 - 1148
  • [23] A multi-scale information integration framework for infrared and visible image fusion
    Yang, Guang
    Li, Jie
    Lei, Hanxiao
    Gao, Xinbo
    NEUROCOMPUTING, 2024, 600
  • [24] Prompt learning and multi-scale attention for infrared and visible image fusion
    Li, Yanan
    Ji, Qingtao
    Jiao, Shaokang
    INFRARED PHYSICS & TECHNOLOGY, 2025, 145
  • [25] Infrared and visible image fusion using multi-scale pyramid network
    Zuo, Fengyuan
    Huang, Yongdong
    Li, Qiufu
    Su, Weijian
    INTERNATIONAL JOURNAL OF WAVELETS MULTIRESOLUTION AND INFORMATION PROCESSING, 2022, 20 (05)
  • [26] MPCT: A medical image fusion method based on multi-scale pyramid convolution and Transformer
    Xu, Yi
    Wang, Zijie
    Wu, Shoucai
    Zhan, Xiongfei
    BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2025, 101
  • [27] AFT: Adaptive Fusion Transformer for Visible and Infrared Images
    Chang, Zhihao
    Feng, Zhixi
    Yang, Shuyuan
    Gao, Quanwei
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32 : 2077 - 2092
  • [28] Infrared and Visible Image Fusion Based on Multi-scale Network with Dual-channel Information Cross Fusion Block
    Yang, Yong
    Kong, Xiangkai
    Huang, Shuying
    Wan, Weiguo
    Liu, Jiaxiang
    Zhang, Wang
    2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [29] Multi-scale decomposition based fusion of infrared and visible image via total variation and saliency analysis
    Ma, Tao
    Ma, Jie
    Fang, Bin
    Hu, Fangyu
    Quan, Siwen
    Du, Huajun
    INFRARED PHYSICS & TECHNOLOGY, 2018, 92 : 154 - 162
  • [30] Multi-scale vision transformer classification model with self-supervised learning and dilated convolution
    Xing, Liping
    Jin, Hongmei
    Li, Hong-an
    Li, Zhanli
    COMPUTERS & ELECTRICAL ENGINEERING, 2022, 103