Cross-modal image fusion guided by subjective visual attention

Cited by: 16
Authors
Fang, Aiqing [1 ]
Zhao, Xinbo [1 ]
Zhang, Yanning [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Subjective attention; Top-down subjective task; Multi-task auxiliary learning; Deep learning; QUALITY ASSESSMENT; PERFORMANCE; FRAMEWORK;
DOI
10.1016/j.neucom.2020.07.014
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The human visual perception system shows strong robustness and contextual awareness across a wide range of image processing tasks. This robustness and contextual awareness are closely related to two characteristics of human visual perception: multi-task auxiliary learning and subjective attention. To improve the robustness and contextual awareness of image fusion, we propose a multi-task auxiliary learning image fusion method guided by subjective attention, which effectively unifies the subjective task intention with the prior knowledge of the human brain. To realize the proposed method, we first analyze the mechanism of multi-task auxiliary learning and build a multi-task auxiliary learning network. Second, based on the human visual attention mechanism, we add a visual attention network guided by subjective tasks on top of the multi-task auxiliary learning network; the subjective intention is introduced through the subjective attention task model, so that the network fuses images according to that intention. Finally, to verify the superiority of our method, we conduct experiments on a combined vision system image data set and on an infrared and visible image data set. The results demonstrate that our fusion method outperforms the state of the art in contextual awareness and robustness. (C) 2020 Elsevier B.V. All rights reserved.
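The abstract describes the architecture only at a high level. The sketch below is a minimal PyTorch rendering of that idea, assuming a shared per-modality encoder, a top-down attention branch that predicts a per-pixel fusion weight, and an auxiliary task head that shares the encoder features. The class AttentionGuidedFusion, all layer sizes, and the sigmoid-weighted convex combination of features are illustrative assumptions, not the authors' actual network.

```python
# Minimal sketch of attention-guided cross-modal fusion with an auxiliary
# task head. Shapes, layer sizes, and module names are assumptions made
# for illustration; they do not reproduce the paper's implementation.
import torch
import torch.nn as nn


class AttentionGuidedFusion(nn.Module):
    def __init__(self, channels: int = 32):
        super().__init__()
        # Shared encoder applied to each modality (e.g. infrared / visible).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
        )
        # Subjective-attention branch: predicts a per-pixel weight map from
        # the concatenated features (a stand-in for the task-driven,
        # top-down attention network described in the abstract).
        self.attention = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 1), nn.Sigmoid(),
        )
        # Decoder that reconstructs the fused image from fused features.
        self.decoder = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, 3, padding=1),
        )
        # Auxiliary head (e.g. a saliency-style prediction) that shares the
        # encoder, providing the multi-task auxiliary learning signal.
        self.aux_head = nn.Conv2d(channels, 1, 1)

    def forward(self, ir: torch.Tensor, vis: torch.Tensor):
        f_ir, f_vis = self.encoder(ir), self.encoder(vis)
        # Attention weight in [0, 1] selects between the two modalities.
        w = self.attention(torch.cat([f_ir, f_vis], dim=1))
        fused_feat = w * f_ir + (1.0 - w) * f_vis
        return self.decoder(fused_feat), self.aux_head(fused_feat)


if __name__ == "__main__":
    model = AttentionGuidedFusion()
    ir = torch.rand(1, 1, 128, 128)   # dummy infrared input
    vis = torch.rand(1, 1, 128, 128)  # dummy visible input
    fused, aux = model(ir, vis)
    print(fused.shape, aux.shape)     # both torch.Size([1, 1, 128, 128])
```

In training, the auxiliary head would be supervised by the subjective task (for example, saliency or detection labels) jointly with a fusion reconstruction loss; it is this shared gradient signal that lets the task intention shape the attention weights.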
Pages: 333-345
Page count: 13
Related papers
50 records in total
  • [1] Cross-modal attention guided visual reasoning for referring image segmentation
    Zhang, Wenjing
    Hu, Mengnan
    Tan, Quange
    Zhou, Qianli
    Wang, Rong
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (19) : 28853 - 28872
  • [2] Cross-modal orienting of visual attention
    Hillyard, Steven A.
    Stoermer, Viola S.
    Feng, Wenfeng
    Martinez, Antigona
    McDonald, John J.
    [J]. NEUROPSYCHOLOGIA, 2016, 83 : 170 - 178
  • [3] Cross-modal exogenous visual selective attention
    Zhao, C
    Yang, H
    Zhang, K
    [J]. INTERNATIONAL JOURNAL OF PSYCHOLOGY, 2000, 35 (3-4) : 100 - 100
  • [4] Cross-modal fusion for multi-label image classification with attention mechanism
    Wang, Yangtao
    Xie, Yanzhao
    Zeng, Jiangfeng
    Wang, Hanpin
    Fan, Lisheng
    Song, Yufan
    [J]. COMPUTERS & ELECTRICAL ENGINEERING, 2022, 101
  • [5] CCAFusion: Cross-Modal Coordinate Attention Network for Infrared and Visible Image Fusion
    Li, Xiaoling
    Li, Yanfeng
    Chen, Houjin
    Peng, Yahui
    Pan, Pan
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (02) : 866 - 881
  • [6] Cross-modal attention for multi-modal image registration
    Song, Xinrui
    Chao, Hanqing
    Xu, Xuanang
    Guo, Hengtao
    Xu, Sheng
    Turkbey, Baris
    Wood, Bradford J.
    Sanford, Thomas
    Wang, Ge
    Yan, Pingkun
    [J]. MEDICAL IMAGE ANALYSIS, 2022, 82
  • [7] Cross-Modal Multistep Fusion Network With Co-Attention for Visual Question Answering
    Lao, Mingrui
    Guo, Yanming
    Wang, Hui
    Zhang, Xin
    [J]. IEEE ACCESS, 2018, 6 : 31516 - 31524
  • [8] Utilizing visual attention for cross-modal coreference interpretation
    Byron, D
    Mampilly, T
    Sharma, V
    Xu, TF
    [J]. MODELING AND USING CONTEXT, PROCEEDINGS, 2005, 3554 : 83 - 96