Learning Sequential Contexts using Transformer for 3D Hand Pose Estimation

Cited by: 1
Authors
Khaleghi, Leyla [1 ,2 ]
Marshall, Joshua [1 ,2 ]
Etemad, Ali [1 ,2 ]
Affiliations
[1] Queens Univ Kingston, Dept ECE, Kingston, ON, Canada
[2] Queens Univ Kingston, Ingenu Labs, Res Inst, Kingston, ON, Canada
DOI
10.1109/ICPR56361.2022.9955633
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D hand pose estimation (HPE) is the process of locating the joints of the hand in 3D from any visual input. HPE has recently received increased attention due to its key role in a variety of human-computer interaction applications. Recent HPE methods have demonstrated the advantages of employing videos or multi-view images, allowing for more robust HPE systems. Accordingly, in this study, we propose a new method to perform Sequential learning with Transformer for Hand Pose (SeTHPose) estimation. Our SeTHPose pipeline begins by extracting visual embeddings from individual hand images. We then use a transformer encoder to learn the sequential context along time or viewing angles and generate accurate 2D hand joint locations. Then, a graph convolutional neural network with a U-Net configuration is used to convert the 2D hand joint locations to 3D poses. Our experiments show that SeTHPose performs well on both hand sequence varieties, temporal and angular. Also, SeTHPose outperforms other methods in the field to achieve new state-of-the-art results on two publicly available sequential datasets, STB and MuViHand.
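The sequential-context step described in the abstract can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: it assumes a single attention head, random untrained weights, a toy sequence length and embedding size, and a plain linear read-out to 2D joints. Positional encoding, feed-forward layers, normalization, the image feature extractor, and the graph U-Net that lifts 2D joints to 3D are all omitted. Every name here (`self_attention`, `predict_2d_joints`, `SEQ_LEN`, `EMBED_DIM`) is hypothetical.

```python
import math
import random

NUM_JOINTS = 21  # joints per hand, as stated in the abstract
SEQ_LEN = 4      # assumed: frames (or views) in one sequence
EMBED_DIM = 8    # assumed: size of each per-image visual embedding


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def self_attention(seq, w_q, w_k, w_v):
    """Single-head scaled dot-product attention across the sequence axis."""
    q, k, v = matmul(seq, w_q), matmul(seq, w_k), matmul(seq, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)  # (SEQ_LEN, SEQ_LEN) frame-to-frame scores
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def predict_2d_joints(context, w_out):
    """Linear read-out: one (x, y) pair per joint for every frame."""
    flat = matmul(context, w_out)  # (SEQ_LEN, 2 * NUM_JOINTS)
    return [[(row[2 * j], row[2 * j + 1]) for j in range(NUM_JOINTS)] for row in flat]


rng = random.Random(0)
frames = random_matrix(SEQ_LEN, EMBED_DIM, rng)  # placeholder visual embeddings
w_q, w_k, w_v = (random_matrix(EMBED_DIM, EMBED_DIM, rng) for _ in range(3))
w_out = random_matrix(EMBED_DIM, 2 * NUM_JOINTS, rng)

context, weights = self_attention(frames, w_q, w_k, w_v)
joints_2d = predict_2d_joints(context, w_out)
print(len(joints_2d), len(joints_2d[0]))
```

Each row of `weights` shows how much one frame draws on every other frame in the sequence, which is the sense in which the encoder learns context along time or viewing angle. The predicted coordinates themselves are meaningless here because the weights are random.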
Pages: 535-541
Number of pages: 7