DBMHT: A double-branch multi-hypothesis transformer for 3D human pose estimation in video

Citations: 0
Authors
Xiang, Xuezhi [1, 2]
Li, Xiaoheng [1 ]
Bao, Weijie [1 ]
Qiao, Yulong [1, 3]
El Saddik, Abdulmotaleb [3 ]
Affiliations
[1] Harbin Engn Univ, Sch Informat & Commun Engn, Harbin 150001, Peoples R China
[2] Minist Ind & Informat Technol, Key Lab Adv Marine Commun & Informat Technol, Harbin 150001, Peoples R China
[3] Univ Ottawa, Sch Elect Engn & Comp Sci, Ottawa, ON K1N 6N5, Canada
Funding
Natural Science Foundation of Heilongjiang Province; National Natural Science Foundation of China
Keywords
3D human pose estimation; Transformer; Dual-branch; Cross-hypothesis
DOI
10.1016/j.cviu.2024.104147
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Estimating 3D human poses from monocular videos is a significant challenge. Existing methods suffer from depth ambiguity and self-occlusion. To overcome these problems, we propose a Double-Branch Multi-Hypothesis Transformer (DBMHT). Specifically, we use a double-branch architecture to capture temporal and spatial information and to generate multiple hypotheses. To merge these hypotheses, we adopt a lightweight module that integrates spatial and temporal representations. DBMHT can therefore capture spatial information from each joint in the human body and temporal information from each frame in the video, and it can also merge multiple hypotheses that carry different spatio-temporal information. Comprehensive evaluation on two challenging datasets (i.e., Human3.6M and MPI-INF-3DHP) demonstrates the superior performance of DBMHT, marking it as a robust and efficient approach for accurate 3D human pose estimation (HPE) in dynamic scenarios. The results show that our model surpasses the state-of-the-art approach by 1.9% in MPJPE when ground-truth 2D keypoints are used as input.
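The abstract reports its result in MPJPE (mean per-joint position error), which is commonly defined as the average Euclidean distance between predicted and ground-truth 3D joint positions. The following is a minimal sketch of that standard definition, not the authors' evaluation code; the function name `mpjpe` and the nested-list input layout (frames, then joints, then x/y/z coordinates) are assumptions made for illustration.

```python
import math


def mpjpe(pred, target):
    """Mean per-joint position error over all frames and joints.

    pred, target: sequences of frames; each frame is a sequence of
    (x, y, z) joint positions. This layout is an illustrative assumption.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of frames")
    total = 0.0
    count = 0
    for frame_pred, frame_target in zip(pred, target):
        if len(frame_pred) != len(frame_target):
            raise ValueError("frames must have the same number of joints")
        for joint_pred, joint_target in zip(frame_pred, frame_target):
            # Euclidean distance between the predicted and true joint.
            total += math.dist(joint_pred, joint_target)
            count += 1
    if count == 0:
        raise ValueError("no joints to evaluate")
    return total / count


# One frame, two joints: the errors are 3.0 and 4.0, so the mean is 3.5.
pred = [[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]]
target = [[(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]]
print(mpjpe(pred, target))  # 3.5
```

The result is expressed in whatever unit the joint coordinates use (millimetres are typical for Human3.6M), and lower values indicate more accurate poses.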
Pages: 8