A Multi-Task Convolutional Neural Network for Infrared and Visible Multi-Resolution Image Fusion

Cited: 0
Authors
Zhu Wen-qing [1 ,2 ,3 ]
Zhang Ning [1 ,2 ,3 ]
Li Zheng [1 ,2 ,3 ]
Liu Peng [1 ,3 ]
Tang Xin-yi [1 ,3 ]
Affiliations
[1] Chinese Acad Sci, Shanghai Inst Tech Phys, Shanghai 200083, Peoples R China
[2] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[3] Chinese Acad Sci, Key Lab Infrared Syst Detect & Imaging Technol, Shanghai 200083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Infrared and visible image fusion; Multi-resolution image fusion; Linear attention; Gradient loss; Infrared image super-resolution;
DOI
10.3964/j.issn.1000-0593(2023)01-0289-08
CLC Number
O433 [Spectroscopy];
Subject Classification
0703; 070302;
Abstract
Infrared and visible image fusion has always been a research hotspot in the image field. Fusion technology can compensate for a single sensor's deficiency and provide a good imaging foundation for image understanding and analysis. Due to the limitations of production technology and cost, the resolution of infrared detectors is much lower than that of visible detectors, which greatly limits practical application. A multi-task convolutional neural network framework combining infrared super-resolution and image fusion tasks is proposed and applied to infrared and visible multi-resolution image fusion. In terms of network structure, firstly, a dual-channel network is designed to extract infrared and visible features separately, so that the proposed algorithm is not constrained by the resolution of either source image. Secondly, a feature up-sampling block is proposed, which uses bilinear interpolation to increase the number of pixels; the mapping between the pixel-smooth space and the high-frequency space is then refined via a multilayer perceptron. As a result, infrared images can be reconstructed at arbitrary scales, including scales not provided during training. Furthermore, a linear self-attention mechanism is introduced into the network to learn the nonlinear relationships between feature-space positions, suppress irrelevant information and enhance the expression of global information. In terms of the loss function, a gradient loss is proposed that retains, at each position, the filter response with the larger absolute value from the infrared and visible images, and computes the Frobenius norm between this target and the response of the reconstructed fusion image. Thus, fusion images can be generated without ideal ground-truth images supervising network learning. Finally, the fused and high-resolution infrared images can be reconstructed simultaneously by optimizing the multi-task model under the combined action of the gradient loss and the pixel loss.
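The gradient loss described above can be sketched as follows. This is a minimal illustration in PyTorch that assumes Sobel filters as the gradient operator and single-channel inputs; the paper does not specify the exact filter, so the operator choice and function names here are assumptions, not the authors' implementation:

```python
import torch
import torch.nn.functional as F

def gradient_magnitude(img):
    """Approximate per-pixel gradient magnitude with Sobel filters.

    img: tensor of shape (N, 1, H, W). The Sobel kernel is an assumed
    choice of gradient operator for this sketch.
    """
    kx = torch.tensor([[-1., 0., 1.],
                       [-2., 0., 2.],
                       [-1., 0., 1.]]).view(1, 1, 3, 3).to(img.device, img.dtype)
    ky = kx.transpose(2, 3)
    gx = F.conv2d(img, kx, padding=1)
    gy = F.conv2d(img, ky, padding=1)
    return torch.sqrt(gx ** 2 + gy ** 2 + 1e-8)

def gradient_loss(fused, ir, vis):
    """Keep, at each pixel, the stronger source gradient response and
    penalise the Frobenius-norm distance to the fused image's response."""
    target = torch.maximum(gradient_magnitude(ir), gradient_magnitude(vis))
    return torch.norm(gradient_magnitude(fused) - target, p='fro')
```

Because the target is built from the source images themselves, this loss requires no ground-truth fusion image, which matches the unsupervised setting described in the abstract.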
The proposed approach is trained on the RoadScene dataset and compared with four related algorithms on the TNO dataset. In terms of subjective performance, the proposed method accepts source images at arbitrary resolutions, and the fusion images exhibit prominent infrared targets and rich visible details. Even when the resolutions of the source images differ considerably, the proposed method can still reconstruct high-resolution infrared images with clear features and generalizes robustly. The objective performance is excellent across multiple evaluation metrics, such as entropy, the sum of the correlations of differences, and spatial frequency. Experimental results demonstrate that the fusion images carry a large amount of information, a high information conversion rate and high clarity, which verifies the effectiveness of the proposed method.
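Two of the evaluation metrics mentioned above, entropy and spatial frequency, are standard no-reference measures and can be computed as in the following NumPy sketch (assuming single-channel 8-bit images; the sum of the correlations of differences is omitted here):

```python
import numpy as np

def entropy(img, bins=256):
    """Shannon entropy of the grey-level histogram (bits); higher values
    indicate a larger amount of information in the image."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def spatial_frequency(img):
    """Spatial frequency: combines row and column first-difference
    energies; higher values indicate more detail and higher clarity."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # row frequency
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # column frequency
    return float(np.sqrt(rf ** 2 + cf ** 2))
```

A flat image scores zero on both metrics, while a fusion result that preserves source detail scores higher, which is why these measures are used to quantify information content and clarity.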
Pages: 289-296 (8 pages)