WaveFusionNet: Infrared and visible image fusion based on multi-scale feature encoder-decoder and discrete wavelet decomposition

Cited by: 0
Authors
Liu, Renhe [1]
Liu, Yu [1]
Wang, Han [1]
Du, Shan [2]
Affiliations
[1] Tianjin Univ, Sch Microelect, Tianjin 300072, Peoples R China
[2] Univ British Columbia, Dept Comp Sci Math Phys & Stat, Okanagan Campus, Kelowna, BC V1V 1V7, Canada
Keywords
Infrared and visible image fusion; Frequency feature decomposition; Discrete wavelet transform; Multi-scale encoder; Dual-band feature fusion; QUALITY ASSESSMENT; TRANSFORM; FRAMEWORK; NETWORK; NEST;
DOI
10.1016/j.optcom.2024.131024
Chinese Library Classification (CLC) number
O43 [Optics];
Discipline classification code
070207; 0803;
Abstract
To merge complementary information from multimodal images, such as thermal saliency from infrared images and texture details from visible images, traditional multi-scale transform-based methods have been extensively studied, and deep learning-based methods have gained significant popularity in recent years. However, there has been limited research on optimally combining the advantages of these two categories in fusion. In this paper, we propose a novel infrared and visible image fusion (IVIF) framework, WaveFusionNet, which integrates the precise frequency feature decomposition of the discrete wavelet transform (DWT) with the comprehensive feature extraction of a multi-scale encoder. First, we train an encoder-decoder network for multi-scale feature extraction and image reconstruction. DWT is used for down-sampling with minimal information loss by decomposing the extracted features into low- and high-frequency sub-bands. Next, a dual-band feature fusion (DBFF) module is trained to merge these sub-bands by integrating a spatial feature transform-based sub-network for low-frequency fusion and a maximum absolute value selection strategy for high-frequency fusion. Finally, all fused sub-bands are fed into the pre-trained decoder to reconstruct the final image. Experimental results on three benchmark datasets (TNO, Roadscene, and MSRS) demonstrate that the proposed fusion method outperforms recent IVIF methods in both quantitative assessment and visual perception while maintaining competitive time complexity.
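The sub-band decomposition and the maximum absolute value selection rule described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a single-level Haar wavelet (the abstract does not name the wavelet) and applies the transform to raw image arrays rather than to encoder features. The low-frequency band is merged by plain averaging, which is only a stand-in for the trained spatial feature transform sub-network. All function names are hypothetical.

```python
import numpy as np


def haar_dwt2(x):
    """Single-level 2D Haar DWT of an array with even height and width.

    Returns the low-frequency band and a tuple of three high-frequency bands,
    each at half the input resolution (this is the down-sampling step).
    """
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    low = (a + b + c + d) / 2.0          # local average (low frequency)
    row_diff = (a + b - c - d) / 2.0     # difference between the two rows
    col_diff = (a - b + c - d) / 2.0     # difference between the two columns
    diag_diff = (a - b - c + d) / 2.0    # diagonal difference
    return low, (row_diff, col_diff, diag_diff)


def haar_idwt2(low, highs):
    """Inverse of haar_dwt2; reconstructs the full-resolution array exactly."""
    row_diff, col_diff, diag_diff = highs
    h, w = low.shape
    x = np.empty((2 * h, 2 * w), dtype=np.float64)
    x[0::2, 0::2] = (low + row_diff + col_diff + diag_diff) / 2.0
    x[0::2, 1::2] = (low + row_diff - col_diff - diag_diff) / 2.0
    x[1::2, 0::2] = (low - row_diff + col_diff - diag_diff) / 2.0
    x[1::2, 1::2] = (low - row_diff - col_diff + diag_diff) / 2.0
    return x


def max_abs_select(h_ir, h_vis):
    """Keep, per coefficient, whichever input has the larger magnitude."""
    return np.where(np.abs(h_ir) >= np.abs(h_vis), h_ir, h_vis)


def fuse(ir, vis):
    """Fuse two same-sized arrays in the Haar wavelet domain."""
    low_ir, highs_ir = haar_dwt2(ir.astype(np.float64))
    low_vis, highs_vis = haar_dwt2(vis.astype(np.float64))
    # ASSUMPTION: averaging replaces the learned low-frequency fusion network.
    low_f = 0.5 * (low_ir + low_vis)
    highs_f = tuple(max_abs_select(a, b) for a, b in zip(highs_ir, highs_vis))
    return haar_idwt2(low_f, highs_f)
```

Because the Haar transform is invertible, applying haar_idwt2 to the output of haar_dwt2 recovers the input exactly, which is the sense in which wavelet down-sampling loses no information. The max-abs rule favors whichever source has the stronger local detail at each coefficient.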
Pages: 15