Appearance-Based Gaze Estimation Method Using Static Transformer Temporal Differential Network

Cited by: 5

Authors:

Li, Yujie [1]

Huang, Longzhao [2]

Chen, Jiahui [2]

Wang, Xiwen [2]

Tan, Benying [1]

Affiliations:

[1] Univ Key Lab AI Algorithm Engn, Guilin Univ Elect Technol, Guangxi Coll, Sch Artificial Intelligence, Jinji Rd, Guilin 541004, Peoples R China

[2] Guilin Univ Elect Technol, Sch Artificial Intelligence, Jinji Rd, Guilin 541004, Peoples R China

Source:

MATHEMATICS | 2023 / Vol. 11 / Issue 03

Funding:

National Natural Science Foundation of China;

Keywords:

gaze estimation; static transformer temporal differential network; static transformer module; temporal differential module; self-attention mechanism;

DOI:

10.3390/math11030686

Chinese Library Classification (CLC) number:

O1 [Mathematics];

Discipline classification code:

0701 ; 070101 ;

Abstract:

Gaze behavior is an important, non-invasive source of human-computer interaction information that plays a role in many fields, including skills transfer, psychology, and human-computer interaction. Recently, improving the performance of appearance-based gaze estimation using deep learning techniques has attracted increasing attention; however, several key problems in these deep-learning-based gaze estimation methods remain. First, the feature fusion stage is not fully considered: existing methods simply concatenate the different extracted features into one feature, without considering their internal relationships. Second, dynamic features can be difficult to learn, because they are ambiguously defined and their extraction process is unstable. In this study, we propose a novel method that addresses the feature fusion and dynamic feature extraction problems. We propose the static transformer module (STM), which uses a multi-head self-attention mechanism to fuse fine-grained eye features and coarse-grained facial features. Additionally, we propose an innovative recurrent neural network (RNN) cell, the temporal differential module (TDM), which can be used to extract dynamic features. We integrated the STM and the TDM into the static transformer with a temporal differential network (STTDN). We evaluated the STTDN performance using two publicly available datasets (MPIIFaceGaze and Eyediap) and demonstrated the effectiveness of the STM and the TDM. Our results show that the proposed STTDN outperformed state-of-the-art methods, including on Eyediap (by 2.9%).


Pages: 18
