Exploiting Temporal Information for 3D Human Pose Estimation

Cited by: 205
Authors
Hossain, Mir Rayat Imtiaz [1]
Little, James J. [1]
Affiliations
[1] Univ British Columbia, Dept Comp Sci, Vancouver, BC, Canada
Source
Keywords
3D human pose; Sequence-to-sequence networks; Layer normalized LSTM; Residual connections;
DOI
10.1007/978-3-030-01249-6_5
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this work, we address the problem of 3D human pose estimation from a sequence of 2D human poses. Although the recent success of deep networks has led many state-of-the-art methods for 3D pose estimation to train deep networks end-to-end to predict directly from images, the top-performing approaches have shown the effectiveness of dividing the task into two steps: using a state-of-the-art 2D pose estimator to estimate the 2D pose from images, and then mapping it into 3D space. They also showed that a low-dimensional representation, such as the 2D locations of a set of joints, can be discriminative enough to estimate 3D pose with high accuracy. However, estimating 3D pose for individual frames leads to temporally incoherent estimates, because independent errors in each frame cause jitter. Therefore, in this work we utilize the temporal information across a sequence of 2D joint locations to estimate a sequence of 3D poses. We designed a sequence-to-sequence network composed of layer-normalized LSTM units, with shortcut connections connecting the input to the output on the decoder side, and imposed a temporal smoothness constraint during training. We found that the knowledge of temporal consistency improves the best reported result on the Human3.6M dataset by approximately 12.2% and helps our network recover temporally consistent 3D poses over a sequence of images even when the 2D pose detector fails.
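The abstract mentions a temporal smoothness constraint but does not give its form. As a minimal sketch, assuming the constraint penalizes squared first-order differences between consecutive predicted poses, the training objective might look like the following. The function names, the flattened pose layout, and the weights `alpha` and `beta` are hypothetical and are not taken from the record.

```python
from typing import Sequence

# Assumption: each pose is a flat list of 3D joint coordinates for one frame.
Pose = Sequence[float]


def mean_squared_error(pred: Sequence[Pose], target: Sequence[Pose]) -> float:
    """Mean over frames of the squared L2 distance between predicted and target poses."""
    total = 0.0
    for p, t in zip(pred, target):
        total += sum((a - b) ** 2 for a, b in zip(p, t))
    return total / len(pred)


def temporal_smoothness(pred: Sequence[Pose]) -> float:
    """Mean squared first-order difference between consecutive predicted frames."""
    if len(pred) < 2:
        return 0.0
    total = 0.0
    for prev, curr in zip(pred[:-1], pred[1:]):
        total += sum((c - p) ** 2 for p, c in zip(prev, curr))
    return total / (len(pred) - 1)


def sequence_loss(pred: Sequence[Pose], target: Sequence[Pose],
                  alpha: float = 1.0, beta: float = 1.0) -> float:
    """Weighted sum of the pose error and the smoothness penalty (weights are illustrative)."""
    return alpha * mean_squared_error(pred, target) + beta * temporal_smoothness(pred)


# Toy single-joint sequences: steady motion versus a jittery middle frame.
smooth = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
jittery = [[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [2.0, 0.0, 0.0]]

print(temporal_smoothness(smooth), temporal_smoothness(jittery))
```

In this toy case the jittery sequence receives a larger smoothness penalty than the steady one, which is the behavior such a term is meant to discourage during training.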
Pages: 69-86
Number of pages: 18
Related papers
50 in total
  • [1] Exploiting Temporal Correlations for 3D Human Pose Estimation
    Wang, Ruibin
    Ying, Xianghua
    Xing, Bowei
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2024, 26 : 4527 - 4539
  • [2] Exploiting temporal context for 3D human pose estimation in the wild
    Arnab, Anurag
    Doersch, Carl
    Zisserman, Andrew
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019 : 3390 - 3399
  • [3] Exploiting Temporal Contexts With Strided Transformer for 3D Human Pose Estimation
    Li, Wenhao
    Liu, Hong
    Ding, Runwei
    Liu, Mengyuan
    Wang, Pichao
    Yang, Wenming
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1282 - 1293
  • [4] On the Effect of Temporal Information on Monocular 3D Human Pose Estimation
    Brauer, Juergen
    Gong, Wenjuan
    Gonzalez, Jordi
    Arens, Michael
    [J]. 2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011
  • [5] SkeletonPose: Exploiting human skeleton constraint for 3D human pose estimation
    Chen, Shu
    Xu, Yaxin
    Pu, Zhengdong
    Ouyang, Jianquan
    Zou, Beiji
    [J]. KNOWLEDGE-BASED SYSTEMS, 2022, 255
  • [6] 3D Human Pose Estimation with Spatial and Temporal Transformers
    Zheng, Ce
    Zhu, Sijie
    Mendieta, Matias
    Yang, Taojiannan
    Chen, Chen
    Ding, Zhengming
    [J]. 2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021 : 11636 - 11645
  • [7] 3D Human Pose Estimation With Spatial Structure Information
    Huang, Xiaoshan
    Huang, Jun
    Tang, Zengming
    [J]. IEEE ACCESS, 2021, 9 : 35947 - 35956
  • [8] 3D Human Pose Estimation in Video with Temporal and Spatial Transformer
    Peng, Sha
    Hu, Jiwei
    [J]. Proceedings of SPIE - The International Society for Optical Engineering, 2023, 12707
  • [9] Semi-Supervised 3D Human Pose Estimation by Jointly Considering Temporal and Multiview Information
    Chu, Wei-Ta
    Pan, Zong-Wei
    [J]. IEEE ACCESS, 2020, 8 : 226974 - 226981
  • [10] Pose Guided Human Motion Transfer by Exploiting 2D and 3D Information
    Zhang, Yahui
    You, Shaodi
    Karaoglu, Sezer
    Gevers, Theo
    [J]. 2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV, 2022 : 587 - 595