SpFormer: Spatio-Temporal Modeling for Scanpaths with Transformer

Cited by: 0
Authors
Zhong, Wenqi [1]
Yu, Linzhi [1]
Xia, Chen [1]
Han, Junwei [1]
Zhang, Dingwen [1]
Affiliations
[1] Northwestern Polytechnical University, School of Automation, Xi'an, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
VISUAL WORKING-MEMORY; EYE-MOVEMENTS; PREDICTION; TASK;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The saccadic scanpath, a data representation of human visual behavior, has attracted broad interest across multiple domains. A scanpath is a complex eye-tracking data modality comprising sequences of fixation positions and fixation durations, coupled with image information. However, previous methods typically suffer from spatial misalignment of fixation features and from the loss of critical temporal information (including temporal correlation and fixation duration). In this study, we propose SpFormer, a Transformer-based scanpath model, to alleviate these problems. First, we propose a fixation-centric paradigm that extracts spatially aligned fixation features and tokenizes the scanpaths. Then, motivated by the visual working memory mechanism, we design a local meta attention that reduces the semantic redundancy of fixations and guides the model to focus on the meta scanpath. Finally, we progressively integrate the duration information and fuse it with the fixation features to address the location ambiguity that arises as the number of Transformer blocks increases. We conduct extensive experiments on four databases under three tasks. SpFormer establishes new state-of-the-art results in distinct settings, verifying its flexibility and versatility in practical applications. The code is available at https://github.com/wenqizhong/SpFormer.
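The abstract does not spell out how fixations are tokenized or how attention is restricted, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that "fixation-centric" can be approximated by bilinearly sampling a feature map at the exact fixation position (instead of snapping to a fixed grid cell), that duration is attached as a normalized scalar, and that local attention can be illustrated by a simple sliding-window mask. All names (`bilinear_sample`, `tokenize_scanpath`, `local_attention_mask`, `max_duration_ms`) are hypothetical and do not come from the SpFormer repository.

```python
from typing import List, Tuple

# Hypothetical fixation record: (x, y, duration_ms), with x and y given
# in feature-map coordinates (column, row). This layout is an assumption.
Fixation = Tuple[float, float, float]


def bilinear_sample(fmap: List[List[float]], x: float, y: float) -> float:
    """Bilinearly interpolate a 2D map at continuous coordinates (x, y)."""
    h, w = len(fmap), len(fmap[0])
    # Clamp the query point so all four neighbours stay inside the map.
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = fmap[y0][x0] * (1 - dx) + fmap[y0][x1] * dx
    bottom = fmap[y1][x0] * (1 - dx) + fmap[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def tokenize_scanpath(
    fmap: List[List[float]],
    fixations: List[Fixation],
    max_duration_ms: float = 1000.0,  # assumed normalization constant
) -> List[Tuple[float, float]]:
    """Turn each fixation into a (feature, normalized_duration) token."""
    tokens = []
    for x, y, duration_ms in fixations:
        feature = bilinear_sample(fmap, x, y)  # feature aligned to the fixation
        tokens.append((feature, min(duration_ms / max_duration_ms, 1.0)))
    return tokens


def local_attention_mask(n: int, window: int) -> List[List[bool]]:
    """True where token i may attend to token j (|i - j| <= window)."""
    return [[abs(i - j) <= window for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    feature_map = [[0.0, 1.0], [2.0, 3.0]]
    scanpath = [(0.5, 0.5, 250.0), (1.0, 1.0, 2000.0)]
    print(tokenize_scanpath(feature_map, scanpath))
    print(local_attention_mask(3, 1))
```

In a real model the feature map would hold a vector per location and the mask would be passed to an attention layer; scalars and boolean lists are used here only to keep the sketch dependency-free.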
Pages: 7605-7613
Page count: 9