Learning to Predict Sequences of Human Visual Fixations

Cited by: 41
Authors
Jiang, Ming [1]
Boix, Xavier [1,2,3]
Roig, Gemma [2,3]
Xu, Juan [1]
Van Gool, Luc [2]
Zhao, Qi [1]
Affiliations
[1] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117583, Singapore
[2] ETH, Comp Vis Lab, CH-8092 Zurich, Switzerland
[3] MIT, Ist Italiano & Tecnol, Ctr Brains Minds & Machines, Lab Computat & Stat Learning, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Keywords
Scanpath prediction; visual saliency prediction; saliency detection; eye movements; attention; framework; scene; video
DOI
10.1109/TNNLS.2015.2496306
CLC Classification
TP18 (Artificial Intelligence Theory)
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Most state-of-the-art visual attention models estimate the probability distribution of eye fixations over image locations, the so-called saliency map. Yet, these models do not predict the temporal sequence of eye fixations, which may be valuable both for better predicting human eye fixations and for understanding the role of different cues during visual exploration. In this paper, we present a method for predicting the sequence of human eye fixations, learned from recorded human eye-tracking data. We use least-squares policy iteration (LSPI) to learn a visual exploration policy that mimics the recorded eye-fixation examples. The model uses a different set of parameters for each stage of visual exploration, capturing the importance of the cues along the scanpath. In a series of experiments, we demonstrate the effectiveness of using LSPI for combining multiple cues at different stages of the scanpath. The learned parameters suggest that low-level and high-level (semantic) cues are similarly important at the first fixation of the scanpath, and that the contribution of high-level cues keeps increasing during visual exploration. Results show that our approach achieves state-of-the-art performance on two challenging data sets: the OSIE data set and the MIT data set.
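The abstract's core learning step, LSPI, alternates a least-squares temporal-difference fit of a linear Q-function, Q(s, a) ≈ phi(s, a)·w, with greedy policy improvement. Below is a minimal sketch of that loop, assuming states are current fixation locations, actions are candidate next fixations, and phi(s, a) stacks the cue values (e.g., low-level saliency and semantic maps) at the candidate location; the sample format, reward, and function names (lstdq, lspi) are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def lstdq(samples, phi, policy, gamma, k):
    """One LSTD-Q solve: fit w so that Q(s, a) ~= phi(s, a) @ w.

    samples: list of (s, a, r, s_next, next_actions) transitions harvested
    from recorded scanpaths; next_actions is empty at the end of a scanpath.
    """
    A = 1e-6 * np.eye(k)   # small ridge term keeps the system invertible
    b = np.zeros(k)
    for s, a, r, s_next, next_actions in samples:
        f = phi(s, a)
        if next_actions:   # non-terminal: bootstrap from the policy's pick
            f_next = phi(s_next, policy(s_next, next_actions))
        else:              # terminal: no future fixation to bootstrap from
            f_next = np.zeros(k)
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)

def lspi(samples, phi, k, gamma=0.9, tol=1e-4, max_iter=50):
    """Alternate LSTD-Q evaluation with greedy policy improvement."""
    w = np.zeros(k)
    # Greedy policy: pick the candidate fixation with the highest Q-value
    # under the current weights (the closure reads w as it is updated).
    greedy = lambda s, actions: max(actions, key=lambda a: float(phi(s, a) @ w))
    for _ in range(max_iter):
        w_new = lstdq(samples, phi, greedy, gamma, k)
        if np.linalg.norm(w_new - w) < tol:
            return w_new
        w = w_new
    return w
```

In the paper's setting, a separate weight vector w would be fit for each stage of the scanpath, so the learned weights can be read as the relative importance of each cue at that stage.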
Pages: 1241-1252 (12 pages)
Related Papers (items 31-40 of 50)
  • [31] Visual search analysis using parametric fixations
    Ishrat, Mohsina; Abrol, Pawanesh
    Multimedia Tools and Applications, 2022, 81: 10007-10022
  • [32] Exploring the nature of visual fixations on other pedestrians
    Fotios, S.; Uttley, J.; Fox, S.
    Lighting Research & Technology, 2018, 50(4): 511-521
  • [33] Accumulation of visual information across multiple fixations
    Pertzov, Yoni; Avidan, Galia; Zohary, Ehud
    Journal of Vision, 2009, 9(10)
  • [34] Leveraging Human Fixations in Sparse Coding: Learning a Discriminative Dictionary for Saliency Prediction
    Jiang, Ming; Song, Mingli; Zhao, Qi
    2013 IEEE International Conference on Systems, Man, and Cybernetics (SMC 2013): 2126-2133
  • [35] The application of machine learning to predict genetic relatedness using human mtDNA hypervariable region I sequences
    Govender, Priyanka; Fashoto, Stephen Gbenga; Maharaj, Leah; Adeleke, Matthew A.; Mbunge, Elliot; Olamijuwon, Jeremiah; Akinnuwesi, Boluwaji; Okpeku, Moses
    PLOS ONE, 2022, 17(2)
  • [36] Deep learning helps EEG signals predict different stages of visual processing in the human brain
    Mathur, Nalin; Gupta, Anubha; Jaswal, Snehlata; Verma, Rohit
    Biomedical Signal Processing and Control, 2021, 70
  • [37] Objects predict fixations better than early saliency
    Einhaeuser, Wolfgang; Spain, Merrielle; Perona, Pietro
    Journal of Vision, 2008, 8(14)
  • [38] Memory and incidental learning for visual frozen noise sequences
    Gold, Jason M.; Aizenman, Avi; Bond, Stephanie M.; Sekuler, Robert
    Vision Research, 2014, 99: 19-36
  • [39] Visual tools in teaching learning sequences for science education
    Ferreira, Celeste; Baptista, Monica; Arroio, Agnaldo
    Problems of Education in the 21st Century, 2011, 37: 48-58
  • [40] Learning to recall temporal sequences of visual stimuli in monkey
    Orlov, T.; Yakovlev, V.; Dvorkin, M.; Zohary, E.; Amit, D.; Hochstein, S.
    Neuroscience Letters, 1997: S38-S39