SpFormer: Spatio-Temporal Modeling for Scanpaths with Transformer

Times cited: 0

Authors:

Zhong, Wenqi [1]

Yu, Linzhi [1]

Xia, Chen [1]

Han, Junwei [1]

Zhang, Dingwen [1]

Affiliations:

[1] Northwestern Polytech Univ, Sch Automat, Xian, Peoples R China

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 7 | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

VISUAL WORKING-MEMORY; EYE-MOVEMENTS; PREDICTION; TASK;

DOI:

Not available

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The saccadic scanpath, a data representation of human visual behavior, has received broad interest in multiple domains. A scanpath is a complex eye-tracking data modality that comprises sequences of fixation positions and fixation durations, coupled with image information. However, previous methods usually suffer from spatial misalignment of fixation features and from the loss of critical temporal information (including temporal correlation and fixation duration). In this study, we propose a Transformer-based scanpath model, SpFormer, to alleviate these problems. First, we propose a fixation-centric paradigm to extract aligned spatial fixation features and tokenize the scanpaths. Then, motivated by the visual working memory mechanism, we design a local meta attention to reduce the semantic redundancy of fixations and guide the model to focus on the meta scanpath. Finally, we progressively integrate the duration information and fuse it with the fixation features to address the location ambiguity that arises as the number of Transformer blocks increases. We conduct extensive experiments on four databases under three tasks. SpFormer establishes new state-of-the-art results in distinct settings, verifying its flexibility and versatility in practical applications. The code can be obtained from https://github.com/wenqizhong/SpFormer.
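
The fixation-centric tokenization mentioned in the abstract can be pictured with a small sketch. This is not the authors' implementation (see the repository linked above for that); it is a minimal illustration under stated assumptions: raw grayscale pixels stand in for learned image features, each fixation is a hypothetical `(x, y, duration_ms)` tuple, and the names `crop_centered` and `tokenize_scanpath` are invented here. The point it shows is that cropping a window centered on each fixation keeps the fixated location at the same position in every token, which is one plausible reading of "aligned spatial fixation features".

```python
from typing import Dict, List, Sequence, Tuple

Image = List[List[float]]          # grayscale image as rows of pixel values
Fixation = Tuple[int, int, float]  # (x, y, duration in ms); assumed layout


def crop_centered(image: Image, x: int, y: int, size: int) -> Image:
    """Return a size x size patch centered on (x, y), zero-padded at borders."""
    half = size // 2
    height, width = len(image), len(image[0])
    patch = []
    for row in range(y - half, y - half + size):
        patch_row = []
        for col in range(x - half, x - half + size):
            inside = 0 <= row < height and 0 <= col < width
            patch_row.append(image[row][col] if inside else 0.0)
        patch.append(patch_row)
    return patch


def tokenize_scanpath(
    image: Image, fixations: Sequence[Fixation], size: int = 3
) -> List[Dict[str, object]]:
    """Build one token per fixation: a flattened centered patch plus duration."""
    tokens = []
    for x, y, duration in fixations:
        patch = crop_centered(image, x, y, size)
        flat = [value for patch_row in patch for value in patch_row]
        tokens.append({"patch": flat, "duration": duration})
    return tokens


# Toy 4 x 4 image whose pixel value encodes its position (row * 4 + col).
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
fixations = [(1, 1, 200.0), (0, 0, 150.0)]
tokens = tokenize_scanpath(image, fixations, size=3)
```

With an odd window size, the fixated pixel always lands at the middle index of the flattened patch, including for fixations at the image border, where the missing context is zero-padded. A real model would presumably crop from a feature map rather than raw pixels and embed the duration rather than carry it as a scalar.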

Pages: 7605-7613

Page count: 9
