Multi-Task Spatial-Temporal Graph Auto-Encoder for Hand Motion Denoising

Authors
Zhou, Kanglei [1]
Shum, Hubert P. H. [2]
Li, Frederick W. B. [2]
Liang, Xiaohui [1, 3]
Affiliations
[1] Beihang Univ, State Key Lab Virtual Real Technol & Syst, Beijing 100191, Peoples R China
[2] Univ Durham, Dept Comp Sci, Durham DH1 3LE, England
[3] Zhongguancun Lab, Beijing 100081, Peoples R China
Funding
Engineering and Physical Sciences Research Council (UK); National Natural Science Foundation of China
Keywords
Graph convolutional network; hand motion denoising; hand motion prediction; multi-task learning; generative adversarial network
DOI
10.1109/TVCG.2023.3337868
Chinese Library Classification
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
In many human-computer interaction applications, fast and accurate hand tracking is necessary for an immersive experience. However, raw hand motion data can be flawed due to issues such as joint occlusions and high-frequency noise, hindering the interaction. Using only current motion for interaction can lead to lag, so predicting future movement is crucial for a faster response. Our solution is the Multi-task Spatial-Temporal Graph Auto-Encoder (Multi-STGAE), a model that accurately denoises and predicts hand motion by exploiting the inter-dependency of both tasks. The model ensures a stable and accurate prediction through denoising while maintaining motion dynamics to avoid over-smoothed motion and alleviate time delays through prediction. A gate mechanism is integrated to prevent negative transfer between tasks and further boost multi-task performance. Multi-STGAE also includes a spatial-temporal graph autoencoder block, which models hand structures and motion coherence through graph convolutional networks, reducing noise while preserving hand physiology. Additionally, we design a novel hand partition strategy and hand bone loss to improve natural hand motion generation. We validate the effectiveness of our proposed method by contributing two large-scale datasets with a data corruption algorithm based on two benchmark datasets. To evaluate the natural characteristics of the denoised and predicted hand motion, we propose two structural metrics. Experimental results show that our method outperforms the state-of-the-art, showcasing how the multi-task framework enables mutual benefits between denoising and prediction.
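The abstract mentions a hand bone loss and two structural metrics but does not define them. As a rough illustration of what a bone-based structural check could look like, the sketch below measures how far the bone lengths of a processed hand sequence drift from a reference sequence. Everything in it is an assumption made for illustration: the 5-joint chain in `PARENTS`, the function names, and the mean-absolute-difference formula are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical 5-joint chain (wrist plus one finger): parent index per joint,
# -1 marks the root. A real hand skeleton would have more joints and branches.
PARENTS = [-1, 0, 1, 2, 3]


def bone_lengths(frame, parents=PARENTS):
    """Return the length of every bone (joint to its parent) in one frame.

    `frame` is a list of (x, y, z) joint positions indexed like `parents`.
    """
    return [math.dist(frame[j], frame[p]) for j, p in enumerate(parents) if p >= 0]


def bone_length_error(pred_seq, ref_seq, parents=PARENTS):
    """Mean absolute bone-length difference over all frames and bones.

    This is an assumed, simplified stand-in for a structural metric,
    not the paper's definition.
    """
    total, count = 0.0, 0
    for pred, ref in zip(pred_seq, ref_seq):
        for lp, lr in zip(bone_lengths(pred, parents), bone_lengths(ref, parents)):
            total += abs(lp - lr)
            count += 1
    return total / count if count else 0.0


# Usage: a straight finger with unit-length bones, and a copy whose
# fingertip has been stretched by one unit.
ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
pred = ref[:4] + [(5.0, 0.0, 0.0)]
error = bone_length_error([pred], [ref])
```

A quantity like this penalizes outputs in which bones stretch or shrink, which is one plausible way to encourage the physiologically consistent motion the abstract describes.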
Pages: 6754-6769
Number of pages: 16