Visual saliency prediction using multi-scale attention gated network

Cited by: 0
Authors
Yubao Sun
Mengyang Zhao
Kai Hu
Shaojing Fan
Affiliations
[1] Nanjing University of Information Science and Technology, The Jiangsu Key Laboratory of Big Data Analysis Technology (B-DAT Laboratory), Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology
[2] National University of Singapore
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Saliency prediction; Multi-scale attention; Gating fusion
DOI: not available
Abstract
Predicting human visual attention can not only increase our understanding of the underlying biological mechanisms, but also bring new insights for other computer vision-related tasks such as autonomous driving and human–computer interaction. Current deep learning-based methods often place emphasis on high-level semantic features for prediction. However, high-level semantic features lack fine-scale spatial information. Ideally, a saliency prediction model should include both spatial and semantic features. In this paper, we propose a multi-scale attention gated network (referred to as MSAGNet) to fuse semantic features with different spatial resolutions for visual saliency prediction. Specifically, we adopt the high-resolution net (HRNet) as the backbone to extract multi-scale semantic features. A multi-scale attention gating module is designed to adaptively fuse these multi-scale features in a hierarchical way. Different from the conventional approach of concatenating features from multiple layers or multi-scale inputs, this module calculates a spatial attention map from the high-level semantic feature and then fuses it with the low-level spatial feature through a gating operation. Through this hierarchical gating fusion, the final saliency prediction is achieved at the finest scale. Extensive experimental analyses on three benchmark datasets demonstrate the superior performance of the proposed method.
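The gating operation described in the abstract (a spatial attention map computed from the high-level semantic feature, upsampled and used to reweight the low-level spatial feature) can be sketched roughly as follows. This is an illustrative NumPy sketch only: the function name `attention_gated_fusion`, the 1×1 projection weights `w_attn`, the nearest-neighbour upsampling, and all tensor shapes are assumptions, not the paper's actual implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_gated_fusion(high_feat, low_feat, w_attn):
    """Sketch of one attention-gating fusion step.

    high_feat: (C_h, H, W)   high-level semantic feature (coarse scale)
    low_feat:  (C_l, sH, sW) low-level spatial feature (finer scale)
    w_attn:    (C_h,)        assumed 1x1-conv weights projecting channels
                             to a single-channel attention map
    """
    # Spatial attention map from the high-level feature: channel-wise
    # projection followed by a sigmoid, giving values in (0, 1).
    attn = sigmoid(np.tensordot(w_attn, high_feat, axes=([0], [0])))  # (H, W)

    # Upsample the attention map to the low-level resolution
    # (nearest-neighbour, assuming an integer scale factor).
    scale = low_feat.shape[1] // attn.shape[0]
    attn_up = np.kron(attn, np.ones((scale, scale)))  # (sH, sW)

    # Gating: reweight the low-level spatial feature by the attention map.
    return low_feat * attn_up[None, :, :]
```

Applied hierarchically from the coarsest to the finest HRNet branch, each fused output would serve as the low-level input to the next step, so the final prediction emerges at the finest scale.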
Pages: 131–139 (8 pages)
Related papers (50 records)
  • [41] Multi-scale coupled attention for visual object detection
    Li, Fei
    Yan, Hongping
    Shi, Linsu
    SCIENTIFIC REPORTS, 2024, 14 (01):
  • [42] Multi-scale Spectrum Visual Saliency Perception via Hypercomplex DCT
    Xiao, Limei
    Li, Ce
    Hu, Zhijia
    Pan, Zhengrong
    INTELLIGENT COMPUTING THEORIES AND APPLICATION, ICIC 2016, PT II, 2016, 9772 : 645 - 655
  • [43] Attention Prediction in Egocentric Video Using Motion and Visual Saliency
    Yamada, Kentaro
    Sugano, Yusuke
    Okabe, Takahiro
    Sato, Yoichi
    Sugimoto, Akihiro
    Hiraki, Kazuo
    ADVANCES IN IMAGE AND VIDEO TECHNOLOGY, PT I, 2011, 7087 : 277 - +
  • [44] Multi-scale fusion visual attention network for facial micro-expression recognition
    Pan, Hang
    Yang, Hongling
    Xie, Lun
    Wang, Zhiliang
    FRONTIERS IN NEUROSCIENCE, 2023, 17
  • [45] Multi-scale network with shared cross-attention for audio–visual correlation learning
    Jiwei Zhang
    Yi Yu
    Suhua Tang
    Wei Li
    Jianming Wu
    Neural Computing and Applications, 2023, 35 : 20173 - 20187
  • [46] Stereoscopic Visual Discomfort Prediction Using Multi-scale DCT Features
    Zhou, Yang
    Yu, Wanli
    Li, Zhu
    Yin, Haibing
    PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19), 2019, : 184 - 191
  • [47] Language conditioned multi-scale visual attention networks for visual grounding
    Yao, Haibo
    Wang, Lipeng
    Cai, Chengtao
    Wang, Wei
    Zhang, Zhi
    Shang, Xiaobing
    IMAGE AND VISION COMPUTING, 2024, 150
  • [48] Attention-Based Multi-Scale Prediction Network for Time-Series Data
    Junjie Li
    Lin Zhu
    Yong Zhang
    Da Guo
    Xingwen Xia
    China Communications, 2022, 19 (05) : 286 - 301
  • [49] Tool Wear Prediction Based on a Multi-Scale Convolutional Neural Network with Attention Fusion
    Huang, Qingqing
    Wu, Di
    Huang, Hao
    Zhang, Yan
    Han, Yan
    INFORMATION, 2022, 13 (10)
  • [50] MCDAN: A Multi-Scale Context-Enhanced Dynamic Attention Network for Diffusion Prediction
    Wang, Xiaowen
    Wang, Lanjun
    Su, Yuting
    Zhang, Yongdong
    Liu, An-An
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 7850 - 7862