Visual saliency prediction using multi-scale attention gated network

Cited by: 0
Authors
Yubao Sun
Mengyang Zhao
Kai Hu
Shaojing Fan
Affiliations
[1] Nanjing University of Information Science and Technology, The Jiangsu Key Laboratory of Big Data Analysis Technology (B-DAT Laboratory), Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology
[2] National University of Singapore
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Saliency prediction; Multi-scale attention; Gating fusion
DOI
Not available
Abstract
Predicting human visual attention can not only increase our understanding of the underlying biological mechanisms, but also bring new insights for other computer vision-related tasks such as autonomous driving and human–computer interaction. Current deep learning-based methods often emphasize high-level semantic features for prediction. However, high-level semantic features lack fine-scale spatial information. Ideally, a saliency prediction model should include both spatial and semantic features. In this paper, we propose a multi-scale attention gated network (referred to as MSAGNet) that fuses semantic features with different spatial resolutions for visual saliency prediction. Specifically, we adopt the high-resolution net (HRNet) as the backbone to extract multi-scale semantic features. A multi-scale attention gating module is designed to adaptively fuse these multi-scale features in a hierarchical way. Unlike the conventional approach of concatenating features from multiple layers or multi-scale inputs, this module calculates a spatial attention map from the high-level semantic feature and then fuses it with the low-level spatial feature through a gating operation. Through the hierarchical gating fusion, the final saliency prediction is achieved at the finest scale. Extensive experimental analyses on three benchmark datasets demonstrate the superior performance of the proposed method.
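The gating idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the exact gating operation, so the sketch assumes a sigmoid attention map computed from the coarser feature, nearest-neighbour upsampling by a factor of 2, and element-wise multiplication with the finer feature. The function names (`gated_fusion`, `hierarchical_fusion`) and the single-channel 2D maps are hypothetical simplifications.

```python
import math


def sigmoid(x):
    """Logistic function, used here as an assumed attention activation."""
    return 1.0 / (1.0 + math.exp(-x))


def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2D map stored as a list of rows."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def gated_fusion(low, high, factor=2):
    """Gate a fine-scale map `low` with attention derived from a coarse map `high`.

    Assumption: attention = sigmoid(upsample(high)), fused = low * attention.
    """
    attn = [[sigmoid(v) for v in row] for row in upsample_nearest(high, factor)]
    return [[l * a for l, a in zip(lrow, arow)] for lrow, arow in zip(low, attn)]


def hierarchical_fusion(features):
    """Fuse maps ordered coarsest first; each map doubles the previous resolution."""
    fused = features[0]
    for finer in features[1:]:
        fused = gated_fusion(finer, fused)
    return fused


# Toy three-scale pyramid: 1x1, 2x2 and 4x4 single-channel maps.
coarse = [[0.0]]
mid = [[1.0, 1.0], [1.0, 1.0]]
fine = [[2.0] * 4 for _ in range(4)]
saliency = hierarchical_fusion([coarse, mid, fine])
```

The result has the resolution of the finest input, which mirrors the abstract's statement that the final prediction is produced at the finest scale. A real network would apply learned convolutions to multi-channel feature tensors before and after each gate.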
Pages: 131 - 139
Page count: 8
Related papers
50 in total
  • [1] Visual saliency prediction using multi-scale attention gated network
    Sun, Yubao
    Zhao, Mengyang
    Hu, Kai
    Fan, Shaojing
    MULTIMEDIA SYSTEMS, 2022, 28 (01) : 131 - 139
  • [2] MULTI-SCALE VISUAL ATTENTION & SALIENCY MODELLING WITH DECISION THEORY
    Anh Cat Le Ngo
    Ang, Li-Minn
    Qiu, Guoping
    Seng, Kah Phooi
    2013 20TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP 2013), 2013, : 216 - 220
  • [3] DeepAttent: Saliency Prediction with Deep Multi-scale Residual Network
    Dwivedi, Kshitij
    Singh, Nitin
    Shanmugham, Sabari R.
    Kumar, Manoj
    PROCEEDINGS OF 3RD INTERNATIONAL CONFERENCE ON COMPUTER VISION AND IMAGE PROCESSING, CVIP 2018, VOL 2, 2020, 1024 : 65 - 73
  • [4] Multi-Scale Spatiotemporal Feature Fusion Network for Video Saliency Prediction
    Zhang, Yunzuo
    Zhang, Tian
    Wu, Cunyu
    Tao, Ran
    IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4183 - 4193
  • [5] MGSNet: A multi-scale and gated spatial attention network for crowd counting
    Ying Shi
    Jun Sang
    Zhongyuan Wu
    Fusen Wang
    Xinyue Liu
    Xiaofeng Xia
    Nong Sang
    Applied Intelligence, 2022, 52 : 15436 - 15446
  • [6] MGSNet: A multi-scale and gated spatial attention network for crowd counting
    Shi, Ying
    Sang, Jun
    Wu, Zhongyuan
    Wang, Fusen
    Liu, Xinyue
    Xia, Xiaofeng
    Sang, Nong
    APPLIED INTELLIGENCE, 2022, 52 (13) : 15436 - 15446
  • [7] Depth-induced Multi-scale Recurrent Attention Network for Saliency Detection
    Piao, Yongri
    Ji, Wei
    Li, Jingjing
    Zhang, Miao
    Lu, Huchuan
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 7253 - 7262
  • [8] LIGHTER AND FASTER CROSS-CONCATENATED MULTI-SCALE RESIDUAL BLOCK BASED NETWORK FOR VISUAL SALIENCY PREDICTION
    Malladi, Sai Phani Kumar
    Mukhopadhyay, Jayanta
    Larabi, Chaker
    Chaudhury, Santanu
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 2503 - 2507
  • [9] MULTI-SCALE TRANSFORMER NETWORK FOR SALIENCY PREDICTION ON 360-DEGREE IMAGES
    Lin, Xu
    Qing, Chunmei
    Tan, Junpeng
    Xu, Xiangmin
    2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 1700 - 1704
  • [10] Multi-Scale Attention and Encoder-Decoder Network for Video Saliency Object Detection
    Hongbo Bi
    Huihui Zhu
    Lina Yang
    Ranwan Wu
    Pattern Recognition and Image Analysis, 2022, 32 : 340 - 350