Learning Sequential Contexts using Transformer for 3D Hand Pose Estimation

Cited by: 1
Authors
Khaleghi, Leyla [1, 2]
Marshall, Joshua [1, 2]
Etemad, Ali [1, 2]
Affiliations
[1] Queens Univ Kingston, Dept ECE, Kingston, ON, Canada
[2] Queens Univ Kingston, Ingenu Labs, Res Inst, Kingston, ON, Canada
DOI
10.1109/ICPR56361.2022.9955633
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D hand pose estimation (HPE) is the process of locating the joints of the hand in 3D from any visual input. HPE has recently received an increased amount of attention due to its key role in a variety of human-computer interaction applications. Recent HPE methods have demonstrated the advantages of employing videos or multi-view images, allowing for more robust HPE systems. Accordingly, in this study, we propose a new method to perform Sequential learning with Transformer for Hand Pose (SeTHPose) estimation. Our SeTHPose pipeline begins by extracting visual embeddings from individual hand images. We then use a transformer encoder to learn the sequential context along time or viewing angles and generate accurate 2D hand joint locations. Then, a graph convolutional neural network with a U-Net configuration is used to convert the 2D hand joint locations to 3D poses. Our experiments show that SeTHPose performs well on both hand sequence varieties, temporal and angular. Also, SeTHPose outperforms other methods in the field to achieve new state-of-the-art results on two publicly available sequential datasets, STB and MuViHand.
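The sequential-context step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes random vectors in place of real image embeddings, a single attention head with random weights, and a plain linear output head. All function names and sizes (`self_attention`, `predict_2d_joints`, `T`, `d`) are hypothetical. Positional encoding, residual connections, layer normalization, the feed-forward sublayer, and the graph U-Net that lifts 2D joints to 3D are all omitted.

```python
import math
import random

NUM_JOINTS = 21  # number of hand joints per frame (assumed, standard hand model)


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a (rows x cols) matrix by a length-cols vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def random_matrix(rows, cols, rng):
    """Random weights standing in for learned parameters (assumption)."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def self_attention(seq, wq, wk, wv):
    """Single-head scaled dot-product self-attention over a sequence.

    seq holds T embeddings (one per frame or camera view), each of length d.
    Returns T context vectors and the T x T attention weights.
    """
    d = len(seq[0])
    queries = [matvec(wq, x) for x in seq]
    keys = [matvec(wk, x) for x in seq]
    values = [matvec(wv, x) for x in seq]
    outputs, weights = [], []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        attn = softmax(scores)
        weights.append(attn)
        # Each output is the attention-weighted sum of all value vectors.
        outputs.append([sum(a * v[i] for a, v in zip(attn, values)) for i in range(d)])
    return outputs, weights


def predict_2d_joints(context, w_out):
    """Linear head mapping one context vector to NUM_JOINTS (x, y) pairs."""
    flat = matvec(w_out, context)
    return [(flat[2 * j], flat[2 * j + 1]) for j in range(NUM_JOINTS)]


rng = random.Random(0)
T, d = 4, 8  # sequence length and embedding size (illustrative values)
embeddings = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(T)]
wq, wk, wv = (random_matrix(d, d, rng) for _ in range(3))
w_out = random_matrix(2 * NUM_JOINTS, d, rng)

contexts, attn_weights = self_attention(embeddings, wq, wk, wv)
joints_2d = [predict_2d_joints(c, w_out) for c in contexts]
```

Because every frame attends to every other frame, the 2D prediction for one image can draw on evidence from the whole sequence, whether the sequence runs along time or along viewing angle.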
Pages: 535 - 541
Page count: 7