DBMHT: A double-branch multi-hypothesis transformer for 3D human pose estimation in video

Cited by: 0
Authors
Xiang, Xuezhi [1, 2]
Li, Xiaoheng [1]
Bao, Weijie [1]
Qiao, Yulong [1, 3]
El Saddik, Abdulmotaleb [3]
Affiliations
[1] Harbin Engn Univ, Sch Informat & Commun Engn, Harbin 150001, Peoples R China
[2] Minist Ind & Informat Technol, Key Lab Adv Marine Commun & Informat Technol, Harbin 150001, Peoples R China
[3] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China;
Keywords
3D human pose estimation; Transformer; Dual-branch; Cross-hypothesis;
DOI
10.1016/j.cviu.2024.104147
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Estimating 3D human poses from monocular video is a significant challenge. Existing methods suffer from depth ambiguity and self-occlusion. To overcome these problems, we propose a Double-Branch Multi-Hypothesis Transformer (DBMHT). Specifically, we use a Double-Branch architecture to capture temporal and spatial information and to generate multiple hypotheses. To merge these hypotheses, we adopt a lightweight module that integrates the spatial and temporal representations. DBMHT can therefore capture spatial information from each joint of the human body and temporal information from each frame of the video, and it can also merge multiple hypotheses that carry different spatio-temporal information. Comprehensive evaluation on two challenging datasets (i.e., Human3.6M and MPI-INF-3DHP) demonstrates the superior performance of DBMHT, marking it as a robust and efficient approach for accurate 3D HPE in dynamic scenarios. The results show that our model surpasses the state-of-the-art approach by 1.9% in MPJPE when ground-truth 2D keypoints are used as input.
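The MPJPE figure quoted in the abstract is the mean per-joint position error: the Euclidean distance between each predicted 3D joint and its ground-truth position, averaged over joints. The sketch below is a minimal illustration of that metric, not the authors' evaluation code. The function names, the toy poses, and the baseline numbers are hypothetical, and reading the 1.9% as a relative reduction is an assumption, since the abstract does not say whether the figure is relative or absolute.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error for one pose.

    pred and gt are equal-length sequences of (x, y, z) joint positions,
    expressed in the same unit (millimetres are commonly used).
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    # Sum the Euclidean distance of every predicted joint to its target.
    total = sum(math.dist(p, g) for p, g in zip(pred, gt))
    return total / len(pred)


def relative_improvement(baseline, ours):
    """Percentage by which `ours` lowers the error relative to `baseline`."""
    return (baseline - ours) / baseline * 100.0


# Toy two-joint pose (hypothetical values): the first joint is exact,
# the second is off by a 3-4-5 triangle, i.e. 5 units.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error = mpjpe(pred, gt)

# Hypothetical errors chosen only to illustrate a 1.9% relative reduction.
gain = relative_improvement(100.0, 98.1)
```

Lower MPJPE is better, so under the relative reading a positive `gain` means the second model has the smaller error.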
Pages: 8
Related papers
50 in total
  • [21] Dual-Path Transformer for 3D Human Pose Estimation
    Zhou, Lu
    Chen, Yingying
    Wang, Jinqiao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (05) : 3260 - 3270
  • [22] DGFormer: Dynamic graph transformer for 3D human pose estimation
    Chen Z.
    Dai J.
    Bai J.
    Pan J.
    Pattern Recognition, 2024, 152
  • [23] End-to-end 3D Human Pose Estimation with Transformer
    Zhang, Bowei
    Cui, Peng
    2022 26TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2022, : 4529 - 4536
  • [24] Video-Based 3D Human Pose Estimation Research
    Tao, Siting
    Zhang, Zhi
    2022 IEEE 17TH CONFERENCE ON INDUSTRIAL ELECTRONICS AND APPLICATIONS (ICIEA), 2022, : 485 - 490
  • [25] Combination of Deep Learner Network and Transformer for 3D Human Pose Estimation
    Tien-Dat Tran
    Xuan-Thuy Vo
    Duy-Linh Nguyen
    Jo, Kang-Hyun
    2022 22ND INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2022), 2022, : 174 - 178
  • [26] Exploiting Temporal Contexts With Strided Transformer for 3D Human Pose Estimation
    Li, Wenhao
    Liu, Hong
    Ding, Runwei
    Liu, Mengyuan
    Wang, Pichao
    Yang, Wenming
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 1282 - 1293
  • [27] Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation Tracking and Forecasting on a Video Snippet
    Zou, Shihao
    Xu, Yuanlu
    Li, Chao
    Ma, Lingni
    Cheng, Li
    Vo, Minh
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 4921 - 4933
  • [28] Double chain networks for monocular 3D human pose estimation
    Bai, Guihu
    Luo, Yanmin
    Pan, Xueliang
    Wang, Youjie
    Wang, Jia
    Guo, Jingming
    IMAGE AND VISION COMPUTING, 2022, 123
  • [29] Joint Camera Pose Estimation and 3D Human Pose Estimation in a Multi-camera Setup
    Puwein, Jens
    Ballan, Luca
    Ziegler, Remo
    Pollefeys, Marc
    COMPUTER VISION - ACCV 2014, PT II, 2015, 9004 : 473 - 487
  • [30] Occlusion-Aware Networks for 3D Human Pose Estimation in Video
    Cheng, Yu
    Yang, Bo
    Wang, Bo
    Yan, Wending
    Tan, Robby T.
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 723 - 732