DeepVS2.0: A Saliency-Structured Deep Learning Method for Predicting Dynamic Visual Attention

Authors
Lai Jiang
Mai Xu
Zulin Wang
Leonid Sigal
Affiliations
[1] Beihang University, School of Electronic and Information Engineering
[2] University of British Columbia, Department of Computer Science
Source
International Journal of Computer Vision, 2021, 129 (01)
Keywords
Deep neural networks; Saliency prediction; Convolutional LSTM; Eye-tracking database; Video; Video database
DOI
Not available
Abstract
Deep neural networks (DNNs) have exhibited great success in image saliency prediction. However, few works apply DNNs to predict the saliency of generic videos. In this paper, we propose a novel DNN-based video saliency prediction method, called DeepVS2.0. Specifically, we establish a large-scale eye-tracking database of videos (LEDOV), which provides sufficient data to train DNN models for predicting video saliency. Through statistical analysis of LEDOV, we find that human attention is normally attracted by objects, particularly moving objects or the moving parts of objects. Accordingly, we propose an object-to-motion convolutional neural network (OM-CNN) in DeepVS2.0 to learn spatio-temporal features for predicting intra-frame saliency by exploring the information of both objectness and object motion. We further find from our database that human attention is temporally correlated, with smooth saliency transitions across video frames. Therefore, a saliency-structured convolutional long short-term memory network (SS-ConvLSTM) is developed in DeepVS2.0 to predict inter-frame saliency, using the features extracted by OM-CNN as the input. Moreover, center-bias dropout and a sparsity-weighted loss are embedded in SS-ConvLSTM to account for the center bias and sparsity of human attention maps. Finally, the experimental results show that DeepVS2.0 advances the state of the art in video saliency prediction.
Pages: 203-224
Number of pages: 21
Citation
Jiang, Lai; Xu, Mai; Wang, Zulin; Sigal, Leonid. DeepVS2.0: A Saliency-Structured Deep Learning Method for Predicting Dynamic Visual Attention [J]. International Journal of Computer Vision, 2021, 129 (01): 203-224