Dual-Path Transformer for 3D Human Pose Estimation

Cited by: 6
Authors
Zhou, Lu [1]
Chen, Yingying [1]
Wang, Jinqiao [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Fdn Model Res Ctr, Beijing 100190, Peoples R China
[2] Wuhan AI Res, Wuhan 430073, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Keywords
Transformers; Three-dimensional displays; Pose estimation; Task analysis; Solid modeling; Feature extraction; Benchmark testing; 3D human pose estimation; transformer; motion; distillation;
DOI
10.1109/TCSVT.2023.3318557
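Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;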
Abstract
Video-based 3D human pose estimation has achieved great progress; however, it is still difficult to learn a precise 2D-to-3D projection in some hard cases. Multi-level human knowledge and motion information are two key elements for overcoming these challenges: the former encodes human structure information spatially, and the latter captures motion change temporally. Inspired by this, we propose DualFormer (dual-path transformer), a network that encodes multiple human contexts and motion detail to perform spatial-temporal modeling. Firstly, motion information, which depicts the movement change of the human body, is embedded to provide an explicit motion prior for the transformer module. Secondly, a dual-path transformer framework is proposed to model long-range dependencies of both the joint sequence and the limb sequence. Parallel context embedding is performed initially, and a cross transformer block is then appended to promote interaction between the two paths, which greatly improves feature robustness. In this way, predictions at multiple levels can be acquired simultaneously. Lastly, we employ a weighted distillation technique to accelerate the convergence of the dual-path framework. We conduct extensive experiments on three benchmarks: Human3.6M, MPI-INF-3DHP and HumanEva-I. We mainly compute MPJPE, P-MPJPE, PCK and AUC to evaluate the effectiveness of the proposed approach, and our work achieves competitive results compared with state-of-the-art approaches. Specifically, the MPJPE is reduced to 42.8 mm on Human3.6M, 1.5 mm lower than PoseFormer, which demonstrates the efficacy of the proposed approach.
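The abstract reports its headline result in MPJPE (mean per-joint position error) without defining the metric. As a minimal sketch, assuming the standard definition (the average Euclidean distance, in millimetres, between predicted and ground-truth 3D joints), it could be computed as below. The function name `mpjpe` and the toy coordinates are illustrative assumptions and do not come from the paper.

```python
import math

def mpjpe(pred, gt):
    """Mean per-joint position error over a pose sequence.

    pred, gt: lists of frames; each frame is a list of (x, y, z)
    joint positions in millimetres. Both must have the same shape.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of frames")
    total, count = 0.0, 0
    for frame_pred, frame_gt in zip(pred, gt):
        if len(frame_pred) != len(frame_gt):
            raise ValueError("frames must contain the same number of joints")
        for p, g in zip(frame_pred, frame_gt):
            total += math.dist(p, g)  # Euclidean distance for one joint
            count += 1
    return total / count

# Hypothetical toy data: 2 frames, 2 joints each (not from any benchmark).
gt = [[(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)],
      [(0.0, 0.0, 0.0), (100.0, 50.0, 0.0)]]
pred = [[(3.0, 4.0, 0.0), (100.0, 0.0, 0.0)],
        [(0.0, 0.0, 0.0), (100.0, 50.0, 12.0)]]

error_mm = mpjpe(pred, gt)
print(f"MPJPE: {error_mm:.2f} mm")
```

On this toy input the per-joint errors are 5, 0, 0 and 12 mm, so the mean is 4.25 mm. P-MPJPE uses the same averaging after the prediction has been Procrustes-aligned (translation, rotation and scale) to the ground truth.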
Pages: 3260-3270
Page count: 11
Related papers
50 records in total
  • [21] EHGFormer: An efficient hypergraph-injected transformer for 3D human pose estimation
    Zheng, Siyuan
    Cao, Weiqun
    IMAGE AND VISION COMPUTING, 2025, 154
  • [22] STRFormer: Spatial-Temporal-ReTemporal Transformer for 3D human pose estimation
    Liu, Xing
    Tang, Hao
    IMAGE AND VISION COMPUTING, 2023, 140
  • [23] Transformer-based 3D Human pose estimation and action achievement evaluation
    Yang, Aolei
    Zhou, Yinghong
    Yang, Banghua
    Xu, Yulin
    Yi Qi Yi Biao Xue Bao/Chinese Journal of Scientific Instrument, 2024, 45 (04): : 136 - 144
  • [24] Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation
    Li, Wenhao
    Liu, Mengyuan
    Liu, Hong
    Wang, Pichao
    Cai, Jialun
    Sebe, Nicu
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 604 - 613
  • [25] MotionAGFormer: Enhancing 3D Human Pose Estimation with a Transformer-GCNFormer Network
    Mehraban, Soroush
    Adeli, Vida
    Taati, Babak
    Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, : 6905 - 6915
  • [26] Adapted human pose: monocular 3D human pose estimation with zero real 3D pose data
    Liu, Shuangjun
    Sehgal, Naveen
    Ostadabbas, Sarah
    APPLIED INTELLIGENCE, 2022, 52 (12) : 14491 - 14506
  • [28] Robust 3D Human Pose Estimation via Dual Dictionaries Learning
    Ji, Hao
    Su, Fei
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3370 - 3373
  • [29] On the Robustness of 3D Human Pose Estimation
    Chen, Zerui
    Huang, Yan
    Wang, Liang
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 5326 - 5332
  • [30] Overview of 3D Human Pose Estimation
    Lin, Jianchu
    Li, Shuang
    Qin, Hong
    Wang, Hongchang
    Cui, Ning
    Jiang, Qian
    Jian, Haifang
    Wang, Gongming
    CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 2023, 134 (03): : 1621 - 1651