SMST: A Saliency Map to Scanpath Transformer

Cited by: 0
Authors
Cao, Xi [1]
Ge, Yong-Feng [2]
Lin, Ying [3]
Affiliations
[1] La Trobe Univ, Dept Comp Sci & Informat Technol, Melbourne 3086, Australia
[2] Victoria Univ, Inst Sustainable Ind & Liveable Cities, Melbourne 3011, Australia
[3] Sun Yat Sen Univ, Dept Psychol, Guangzhou 510006, Peoples R China
Keywords
Virtual reality; Scanpath prediction; Evolutionary algorithm
DOI
10.1007/978-3-031-47843-7_10
CLC number
TP [Automation and computer technology]
Subject classification number
0812
Abstract
In virtual reality (VR) environments, scanpath prediction is critical for saving rendering resources and guiding content design, owing to the omnidirectional nature of the viewing field. However, only a few scanpath prediction models have been proposed, in contrast to the proliferation of saliency map prediction models. Focusing on scanpath prediction for omnidirectional images, this paper introduces a novel model, the saliency map to scanpath transformer (SMST), which transforms a predicted saliency map into a plausible scanpath. The model comprises three stages: filtering, clustering, and routing. Given a saliency map predicted for a VR scene, we first filter out low-saliency areas to obtain a filtered saliency map. We then cluster the filtered map according to its saliency distribution and take the centers of the resulting clusters as candidate fixation points. The fixation points are then connected into a complete graph whose edge weights are defined by the cylindrical distance and the primary visual features of the original image. On this fully connected graph, scanpath prediction reduces to a routing problem: finding the optimal route that visits every fixation point once and only once. We solve this routing problem with an ant colony optimization algorithm. We evaluate the proposed model on several kinds of predicted saliency maps, and its prediction performance is promising.
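The three-stage pipeline described in the abstract (filter, cluster, route) is concrete enough to sketch in code. Below is a minimal Python illustration of the idea, not the authors' implementation: the quantile threshold, the saliency-weighted k-means clustering, the value k = 8, and the ant colony optimization (ACO) parameters are all assumptions made for the sketch, and the edge-weight contribution from primary visual features is omitted, leaving only the wrap-around cylindrical distance.

```python
import numpy as np

def filter_saliency(smap, q=0.8):
    # Filtering: keep only the most salient pixels. The quantile threshold
    # is an assumption; the paper's actual filtering rule is not given here.
    return np.where(smap >= np.quantile(smap, q), smap, 0.0)

def cluster_fixations(smap, k=8, iters=50, seed=0):
    # Clustering: saliency-weighted k-means over the surviving pixels; the
    # cluster centers serve as candidate fixation points. The choice of
    # k-means (and of k itself) is an assumption for this sketch.
    ys, xs = np.nonzero(smap)
    pts = np.stack([ys, xs], axis=1).astype(float)
    w = smap[ys, xs]
    rng = np.random.default_rng(seed)
    centers = pts[rng.choice(len(pts), size=k, replace=False)]
    for _ in range(iters):
        labels = np.linalg.norm(pts[:, None] - centers[None], axis=2).argmin(axis=1)
        for j in range(k):
            m = labels == j
            if m.any():
                centers[j] = np.average(pts[m], axis=0, weights=w[m])
    return centers

def cylindrical_distance(p, q, width):
    # The horizontal axis of an equirectangular panorama wraps around, so the
    # horizontal offset is measured both ways and the shorter one is used.
    dx = abs(p[1] - q[1])
    return float(np.hypot(p[0] - q[0], min(dx, width - dx)))

def aco_route(dist, n_ants=20, n_iter=100, alpha=1.0, beta=3.0, rho=0.1, seed=0):
    # Routing: plain ant colony optimization for the open tour that visits
    # every fixation point exactly once. Parameters are generic ACO defaults.
    n = len(dist)
    tau = np.ones((n, n))            # pheromone levels
    eta = 1.0 / (dist + 1e-9)        # heuristic desirability (inverse distance)
    rng = np.random.default_rng(seed)
    best_route, best_len = None, np.inf
    for _ in range(n_iter):
        tours = []
        for _ in range(n_ants):
            node = int(rng.integers(n))
            route, todo = [node], set(range(n)) - {node}
            while todo:
                cand = np.array(sorted(todo))
                p = tau[node, cand] ** alpha * eta[node, cand] ** beta
                node = int(rng.choice(cand, p=p / p.sum()))
                route.append(node)
                todo.remove(node)
            length = sum(dist[a, b] for a, b in zip(route, route[1:]))
            tours.append((route, length))
            if length < best_len:
                best_route, best_len = route, length
        tau *= 1.0 - rho             # pheromone evaporation
        for route, length in tours:  # deposit: shorter tours leave more pheromone
            for a, b in zip(route, route[1:]):
                tau[a, b] += 1.0 / length
    return best_route

# Demo on a random map standing in for a model-predicted saliency map.
smap = np.random.default_rng(1).random((64, 128))
fix = cluster_fixations(filter_saliency(smap))
n = len(fix)
dist = np.array([[cylindrical_distance(fix[i], fix[j], smap.shape[1])
                  for j in range(n)] for i in range(n)])
print("predicted scanpath (fixation order):", aco_route(dist))
```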
Pages: 136-149
Number of pages: 14
Related papers (50 in total)
  • [1] Assens, Marc; Giro-i-Nieto, Xavier; McGuinness, Kevin; O'Connor, Noel E. Scanpath and saliency prediction on 360 degree images. Signal Processing: Image Communication, 2018, 69: 8-14.
  • [2] Wang, Yixiu; Wang, Bin; Wu, Xiaofeng; Zhang, Liming. Scanpath estimation based on foveated image saliency. Cognitive Processing, 2017, 18(1): 87-95.
  • [3] Liu, Nian; Zhang, Ni; Wan, Kaiyuan; Shao, Ling; Han, Junwei. Visual Saliency Transformer. 2021 IEEE/CVF International Conference on Computer Vision (ICCV 2021), 2021: 4702-4712.
  • [4] Qiu, Mengyu; Rong, Quan; Liang, Dong; Tu, Huawei. Visual ScanPath Transformer: Guiding Computers to See the World. 2023 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 2023: 223-232.
  • [5] Han, Rui; Xiao, Shuangjiu. Human Visual Scanpath Prediction Based on RGB-D Saliency. Proceedings of the 2018 International Conference on Image and Graphics Processing (ICIGP 2018), 2018: 180-184.
  • [6] Liu, Tianpeng; Li, Jing; Wu, Jia; Zhang, Lefei; Chang, Jun; Wan, Jun; Lian, Lezhi. Tracking With Saliency Region Transformer. IEEE Transactions on Image Processing, 2024, 33: 285-296.
  • [7] Ma, Cheng; Sun, Haowen; Rao, Yongming; Zhou, Jie; Lu, Jiwen. Video Saliency Forecasting Transformer. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(10): 6850-6862.
  • [8] Gutierrez, Jesus; David, Erwan; Rai, Yashas; Le Callet, Patrick. Toolbox and dataset for the development of saliency and scanpath models for omnidirectional/360° still images. Signal Processing: Image Communication, 2018, 69: 35-42.
  • [9] Zhang, Dongping; Li, Wenting; Sun, Min; Yu, Haibin. Saliency map for object tracking. International Journal of Signal Processing, Image Processing and Pattern Recognition, 2015, 8(10): 233-240.