Cross-modal image fusion guided by subjective visual attention

Cited: 16
Authors
Fang, Aiqing [1 ]
Zhao, Xinbo [1 ]
Zhang, Yanning [1 ]
Affiliations
[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Xian 710072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Image fusion; Subjective attention; Top-down subjective task; Multi-task auxiliary learning; Deep learning; QUALITY ASSESSMENT; PERFORMANCE; FRAMEWORK;
DOI
10.1016/j.neucom.2020.07.014
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The human visual perception system exhibits strong robustness and contextual awareness across a wide variety of image processing tasks. This robustness and contextual awareness are closely tied to two characteristics of human visual perception: multi-task auxiliary learning and subjective attention. To improve the robustness and contextual awareness of image fusion, we propose a multi-task auxiliary learning image fusion method guided by subjective attention. The method effectively unifies the subjective task intention with the prior knowledge of the human brain. To realize the proposed method, we first analyze the mechanism of multi-task auxiliary learning and build a multi-task auxiliary learning network. Second, based on the human visual attention mechanism, we place a visual attention network guided by subjective tasks on top of the multi-task auxiliary learning network; the subjective intention is injected through the subjective attention task model, so that the network fuses images according to that intention. Finally, to verify the superiority of our method, we conduct experiments on a combined vision system image dataset and on an infrared and visible image dataset. The experimental results demonstrate that our fusion method surpasses the state of the art in contextual awareness and robustness. (C) 2020 Elsevier B.V. All rights reserved.
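The record contains only the abstract, so the paper's actual architecture is not reproduced here. Purely as a minimal sketch of the idea the abstract describes (two modality branches trained jointly with an auxiliary task, fused by a per-pixel attention map conditioned on a top-down "subjective task" signal), the following PyTorch toy model may help; every module, the task embedding, the auxiliary head, and both loss terms are hypothetical stand-ins, not the authors' design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGuidedFusion(nn.Module):
    """Toy two-branch fusion network: a task-conditioned attention map
    weights the two modalities, and an auxiliary head supplies extra
    supervision (multi-task auxiliary learning)."""
    def __init__(self, channels: int = 16, num_tasks: int = 4):
        super().__init__()
        # per-modality feature extractors (hypothetical shallow encoders)
        self.enc_a = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.ReLU())
        self.enc_b = nn.Sequential(nn.Conv2d(1, channels, 3, padding=1), nn.ReLU())
        # top-down "subjective task" embedding that modulates attention
        self.task_embed = nn.Embedding(num_tasks, channels)
        # attention head: one weight per pixel for modality A vs. B
        self.attn = nn.Conv2d(2 * channels, 1, kernel_size=1)
        # reconstruction head producing the fused image
        self.decoder = nn.Conv2d(channels, 1, 3, padding=1)
        # auxiliary head (e.g. a coarse two-class segmentation surrogate)
        self.aux_head = nn.Conv2d(channels, 2, kernel_size=1)

    def forward(self, img_a, img_b, task_id):
        fa, fb = self.enc_a(img_a), self.enc_b(img_b)
        # broadcast the task embedding over spatial positions (top-down signal)
        t = self.task_embed(task_id)[:, :, None, None]
        w = torch.sigmoid(self.attn(torch.cat([fa + t, fb + t], dim=1)))
        fused = w * fa + (1.0 - w) * fb  # attention-weighted feature fusion
        return self.decoder(fused), self.aux_head(fused)

# usage: joint objective = fusion loss + weighted auxiliary-task loss
net = AttentionGuidedFusion()
ir = torch.rand(2, 1, 64, 64)    # e.g. infrared input
vis = torch.rand(2, 1, 64, 64)   # e.g. visible input
task = torch.tensor([0, 1])      # subjective task intentions per sample
fused_img, aux_logits = net(ir, vis, task)
aux_target = torch.randint(0, 2, (2, 64, 64))        # placeholder labels
loss = F.l1_loss(fused_img, torch.max(ir, vis)) \
     + 0.1 * F.cross_entropy(aux_logits, aux_target)  # placeholder aux loss
loss.backward()
```

In this toy setup the element-wise-max fusion target and the auxiliary labels are placeholders; the only point illustrated is the joint loss and the task-conditioned attention weighting that the abstract attributes to the method.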
Pages: 333 - 345
Page count: 13
Related papers
50 records in total
  • [31] Visual question answering with attention transfer and a cross-modal gating mechanism
    Li, Wei
    Sun, Jianhui
    Liu, Ge
    Zhao, Linglan
    Fang, Xiangzhong
    [J]. PATTERN RECOGNITION LETTERS, 2020, 133 : 334 - 340
  • [32] Temporal Cross-Modal Attention for Audio-Visual Event Localization
    Nagasaki, Yoshiki
    Hayashi, Masaki
    Kaneko, Naoshi
    Aoki, Yoshimitsu
    [J]. SEIMITSU KOGAKU KAISHI/JOURNAL OF THE JAPAN SOCIETY FOR PRECISION ENGINEERING, 2022, 88 (03) : 263 - 268
  • [33] The Neural Correlates of Visual and Auditory Cross-Modal Selective Attention in Aging
    Rienacker, Franziska
    Van Gerven, Pascal W. M.
    Jacobs, Heidi I. L.
    Eck, Judith
    Van Heugten, Caroline M.
    Guerreiro, Maria J. S.
    [J]. FRONTIERS IN AGING NEUROSCIENCE, 2020, 12
  • [34] Visual attention guided image fusion with sparse representation
    Yang, Bin
    Li, Shutao
    [J]. OPTIK, 2014, 125 (17) : 4881 - 4888
  • [35] Cross-Modal Attention-Guided Convolutional Network for Multi-modal Cardiac Segmentation
    Zhou, Ziqi
    Guo, Xinna
    Yang, Wanqi
    Shi, Yinghuan
    Zhou, Luping
    Wang, Lei
    Yang, Ming
    [J]. MACHINE LEARNING IN MEDICAL IMAGING (MLMI 2019), 2019, 11861 : 601 - 610
  • [36] VISUAL ATTENTION IN A VISUAL-HAPTIC, CROSS-MODAL MATCHING TASK IN CHILDREN AND ADULTS
    Cote, Carol Ann
    [J]. PERCEPTUAL AND MOTOR SKILLS, 2015, 120 (02) : 381 - 396
  • [37] Cross-Modal Self-Attention Network for Referring Image Segmentation
    Ye, Linwei
    Rochan, Mrigank
    Liu, Zhi
    Wang, Yang
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 10494 - 10503
  • [38] Stacked cross-modal feature consolidation attention networks for image captioning
    Pourkeshavarz, Mozhgan
    Nabavi, Shahabedin
    Moghaddam, Mohsen Ebrahimi
    Shamsfard, Mehrnoush
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (04) : 12209 - 12233
  • [40] Cross-Modal Attention With Semantic Consistence for Image-Text Matching
    Xu, Xing
    Wang, Tan
    Yang, Yang
    Zuo, Lin
    Shen, Fumin
    Shen, Heng Tao
    [J]. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2020, 31 (12) : 5412 - 5425