Dual-Path Transformer for 3D Human Pose Estimation

Cited by: 6
Authors
Zhou, Lu [1]
Chen, Yingying [1]
Wang, Jinqiao [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, Fdn Model Res Ctr, Beijing 100190, Peoples R China
[2] Wuhan AI Res, Wuhan 430073, Peoples R China
[3] Peng Cheng Lab, Shenzhen 518066, Peoples R China
Keywords
Transformers; Three-dimensional displays; Pose estimation; Task analysis; Solid modeling; Feature extraction; Benchmark testing; 3D human pose estimation; transformer; motion; distillation;
DOI
10.1109/TCSVT.2023.3318557
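Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification codes
0808; 0809;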
Abstract
Video-based 3D human pose estimation has made great progress, but learning a precise 2D-to-3D projection remains difficult in hard cases. Multi-level human knowledge and motion information are two key elements for addressing these challenges: the former encodes human structure information spatially, and the latter captures motion change temporally. Inspired by this, we propose DualFormer (dual-path transformer), a network that encodes multiple human contexts and motion detail to perform spatial-temporal modeling. Firstly, motion information, which depicts the movement change of the human body, is embedded to provide an explicit motion prior for the transformer module. Secondly, a dual-path transformer framework is proposed to model long-range dependencies of both the joint sequence and the limb sequence. Parallel context embedding is performed first, and a cross transformer block is then appended to promote interaction between the two paths, which greatly improves feature robustness. Predictions at multiple levels can be acquired simultaneously. Lastly, we employ a weighted distillation technique to accelerate the convergence of the dual-path framework. We conduct extensive experiments on three benchmarks: Human3.6M, MPI-INF-3DHP and HumanEva-I. We mainly compute MPJPE, P-MPJPE, PCK and AUC to evaluate the proposed approach, and our work achieves competitive results compared with state-of-the-art approaches. Specifically, the MPJPE on Human3.6M is reduced to 42.8 mm, 1.5 mm lower than PoseFormer, which supports the efficacy of the proposed approach.
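The two headline metrics named in the abstract can be illustrated with a minimal NumPy sketch. It follows the commonly used definitions (MPJPE as the mean Euclidean distance between predicted and ground-truth joints; P-MPJPE as the same error after a rigid Procrustes alignment with scale). The function names, the 17-joint layout, and the single-frame input shape are assumptions made for illustration; this is not the authors' evaluation code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error: average Euclidean distance per joint.

    pred, gt: arrays of shape (J, 3), in the same unit (e.g. millimetres).
    """
    return float(np.linalg.norm(pred - gt, axis=-1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform (scale, rotation, translation)."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g          # centre both poses

    # Kabsch: SVD of the 3x3 cross-covariance matrix.
    u, s, vt = np.linalg.svd(p.T @ g)
    d = np.sign(np.linalg.det(u @ vt))     # guard against reflections
    diag = np.array([1.0, 1.0, d])

    rot = u @ np.diag(diag) @ vt           # rotation for row vectors: p @ rot ~ g
    scale = (s * diag).sum() / (p ** 2).sum()
    return scale * p @ rot + mu_g


def p_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of the prediction to the ground truth."""
    return mpjpe(procrustes_align(pred, gt), gt)
```

Under these definitions, a prediction that differs from the ground truth only by a global rotation, scale and translation has a large MPJPE but a P-MPJPE close to zero, which is why the two numbers are usually reported together.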
Pages: 3260-3270
Number of pages: 11
Related papers
50 in total
  • [41] Efficient Hierarchical Multi-view Fusion Transformer for 3D Human Pose Estimation
    Zhou, Kangkang
    Zhang, Lijun
    Lu, Feng
    Zhou, Xiang-Dong
    Shi, Yu
    PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2023, 2023, : 7512 - 7520
  • [42] GraFormer: Graph-oriented Transformer for 3D Pose Estimation
    Zhao, Weixi
    Wang, Weiqiang
    Tian, Yunjie
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 20406 - 20415
  • [43] Temporally Consistent 3D Human Pose Estimation Using Dual 360° Cameras
    Shere, Matthew
    Kim, Hansung
    Hilton, Adrian
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2021), 2021, : 81 - 90
  • [44] 3d human pose estimation based on conditional dual-branch diffusion
    Li, Jinghua
    Bai, Zhuowei
    Kong, Dehui
    Chen, Dongpan
    Li, Qianxing
    Yin, Baocai
    MULTIMEDIA SYSTEMS, 2025, 31 (01)
  • [45] A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation
    Peng, Qucheng
    Zheng, Ce
    Chen, Chen
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024, : 2240 - 2249
  • [46] DHRNet: A Dual-path Hierarchical Relation Network for multi-person pose estimation
    Dang, Yonghao
    Yin, Jianqin
    Liu, Liyuan
    Ding, Pengxiang
    Sun, Yuan
    Hu, Yanzhu
    KNOWLEDGE-BASED SYSTEMS, 2024, 300
  • [47] Occlusion Resilient 3D Human Pose Estimation
    Roy, Soumava Kumar
    Badanin, Ilia
    Honari, Sina
    Fua, Pascal
    2024 INTERNATIONAL CONFERENCE IN 3D VISION, 3DV 2024, 2024, : 1198 - 1207
  • [48] A survey on monocular 3D human pose estimation
    Ji X.
    Fang Q.
    Dong J.
    Shuai Q.
    Jiang W.
    Zhou X.
    Virtual Reality and Intelligent Hardware, 2020, 2 (06): 471 - 500
  • [49] Precise 3D Pose Estimation of Human Faces
    Pernek, Akos
    Hajder, Levente
    PROCEEDINGS OF THE 2014 9TH INTERNATIONAL CONFERENCE ON COMPUTER VISION, THEORY AND APPLICATIONS (VISAPP 2014), VOL 3, 2014, : 618 - 625
  • [50] A survey on deep 3D human pose estimation
    Neupane, Rama Bastola
    Li, Kan
    Boka, Tesfaye Fenta
    ARTIFICIAL INTELLIGENCE REVIEW, 2024, 58 (01)