Cross-modal image fusion guided by subjective visual attention

Cited by: 16

Authors:

Fang, Aiqing [1]

Zhao, Xinbo [1]

Zhang, Yanning [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big, Xian 710072, Peoples R China

Source:

NEUROCOMPUTING | 2020 / Vol. 414 / Issue 414

Funding:

National Natural Science Foundation of China;

Keywords:

Image fusion; Subjective attention; Top-down subjective task; Multi-task auxiliary learning; Deep learning; QUALITY ASSESSMENT; PERFORMANCE; FRAMEWORK;

DOI:

10.1016/j.neucom.2020.07.014

Chinese Library Classification:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The human visual perception system is highly robust and context-aware across a variety of image processing tasks. This robustness and contextual awareness are closely related to two characteristics of the human visual perception system: multi-task auxiliary learning and subjective attention. To improve the robustness and contextual awareness of image fusion, we propose a multi-task auxiliary learning image fusion method guided by subjective attention. The method unifies the subjective task intention and the prior knowledge of the human brain. To realize it, we first analyze the mechanism of multi-task auxiliary learning and build a multi-task auxiliary learning network. Second, based on the human visual attention perception mechanism, we add a human visual attention network guided by subjective tasks on top of the multi-task auxiliary learning network. The subjective intention is introduced through the subjective attention task model, so that the network can fuse images according to that intention. Finally, to verify the method, we carry out experiments on a combined vision system image data set and on an infrared and visible image data set. The experimental results demonstrate the superiority of our fusion method over state-of-the-art methods in contextual awareness and robustness. (C) 2020 Elsevier B.V. All rights reserved.


Pages: 333-345

Page count: 13


