Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

Cited by: 0

Authors:

Zhou, Kanglei [1]

Shum, Hubert P. H. [2]

Li, Frederick W. B. [2]

Liang, Xiaohui [1,3]

Affiliations:

[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China

[2] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England

[3] Zhongguancun Lab, Beijing 100081, Peoples R China

Source:

IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS | 2024 / Vol. 30 / No. 10

Funding:

Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China

Keywords:

Graph convolutional network; hand motion denoising; hand motion prediction; multi-task learning; generative adversarial network

DOI:

10.1109/TVCG.2023.3337868

Chinese Library Classification (CLC) number:

TP31 [Computer Software]

Discipline classification code:

081202; 0835

Abstract:

In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
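The abstract names joint occlusions and high-frequency noise as the two main flaws in raw hand motion data, and mentions a data corruption algorithm used to build the datasets. The snippet below is only a rough sketch of what such a corruption step might look like, not the authors' algorithm: the function name `corrupt_motion`, the parameters `noise_std` and `occlusion_prob`, and the choice to represent an occluded joint as the origin are all assumptions made for illustration.

```python
import random


def corrupt_motion(frames, noise_std=0.01, occlusion_prob=0.1, seed=0):
    """Return a corrupted copy of a hand motion clip (illustrative only).

    frames: list of frames, each a list of (x, y, z) joint positions.
    noise_std: standard deviation of per-coordinate Gaussian jitter (assumed).
    occlusion_prob: chance that a joint is dropped in a frame (assumed).
    """
    rng = random.Random(seed)  # local generator keeps the result reproducible
    corrupted = []
    for frame in frames:
        new_frame = []
        for x, y, z in frame:
            if rng.random() < occlusion_prob:
                # Assumption: an occluded joint is reported at the origin.
                new_frame.append((0.0, 0.0, 0.0))
            else:
                # High-frequency noise: independent jitter on each coordinate.
                new_frame.append((
                    x + rng.gauss(0.0, noise_std),
                    y + rng.gauss(0.0, noise_std),
                    z + rng.gauss(0.0, noise_std),
                ))
        corrupted.append(new_frame)
    return corrupted


# Example: a 3-frame clip with 2 joints per frame.
clip = [[(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)] for _ in range(3)]
noisy_clip = corrupt_motion(clip, noise_std=0.05, occlusion_prob=0.2, seed=1)
```

Pairs of `clip` and `noisy_clip` built this way could serve as clean targets and corrupted inputs when training a denoiser; the actual corruption procedure and its parameters are defined in the paper itself.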


Pages: 6754-6769

Page count: 16

50 items in total

[1] STGAE: Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising
Zhou, Kanglei
Cheng, Zhiyuan
Shum, Hubert P. H.
Li, Frederick W. B.
Liang, Xiaohui
2021 IEEE INTERNATIONAL SYMPOSIUM ON MIXED AND AUGMENTED REALITY (ISMAR 2021), 2021, : 41 - 49
[2] GRAPH AUTO-ENCODER FOR GRAPH SIGNAL DENOISING
Tien Huu Do
Duc Minh Nguyen
Deligiannis, Nikos
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 3322 - 3326
[3] Spatial Domain Identification Based on Graph Attention Denoising Auto-encoder
Gao, Yue
Zhang, Dai-Jun
Jiao, Cui-Na
Gao, Ying-Lian
Liu, Jin-Xing
ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT III, 2023, 14088 : 359 - 367
[4] Representation learning with deep sparse auto-encoder for multi-task learning
Zhu, Yi
Wu, Xindong
Qiang, Jipeng
Hu, Xuegang
Zhang, Yuhong
Li, Peipei
PATTERN RECOGNITION, 2022, 129
[5] A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion
Jiang, Junkun
Chen, Jie
Guo, Yike
PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022, 2022, : 5123 - 5131
[6] Multi-Task Spatial-Temporal Graph Attention Network for Taxi Demand Prediction
Wu, Mingming
Zhu, Chaochao
Chen, Lianliang
2020 5TH INTERNATIONAL CONFERENCE ON MATHEMATICS AND ARTIFICIAL INTELLIGENCE (ICMAI 2020), 2020, : 224 - 228
[7] Deep Auto-encoder Based Multi-task Learning Using Probabilistic Transcriptions
Das, Amit
Hasegawa-Johnson, Mark
Vesely, Karel
18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 2073 - 2077
[8] Vision-Based Fall Detection With Multi-Task Hourglass Convolutional Auto-Encoder
Cai, Xi
Li, Suyuan
Liu, Xinyue
Han, Guang
IEEE ACCESS, 2020, 8 : 44493 - 44502
[9] A Stacked Multi-Granularity Convolution Denoising Auto-Encoder
Yang, Yun
Cao, Lijuan
Liu, Qing
Yang, Po
IEEE ACCESS, 2019, 7 : 83888 - 83899
[10] Two-Stream Spatial-Temporal Auto-Encoder With Adversarial Training for Video Anomaly Detection
Guo, Biao
Liu, Mingrui
He, Qian
Jiang, Ming
IEEE ACCESS, 2024, 12 : 125881 - 125889
