Spatial-Temporal-Geometric Graph Convolutional Network for 3-D Human Pose Estimation From Multiview Video

Cited by: 0
Authors
Dong, Kaiwen [1]
Zhou, Yu [2]
Riou, Kevin [3]
Yun, Xiao [2]
Sun, Yanjing [2]
Subrin, Kevin [3]
Le Callet, Patrick [3, 4]
Affiliations
[1] China Univ Min & Technol, IOT Percept Mine Res Ctr, Xuzhou 221008, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China
[3] Nantes Univ, Ecole Cent Nantes, CNRS, UMR 6004, LS2N, F-44306 Nantes, France
[4] Inst Univ France IUF, F-75005 Paris, France
Funding
National Natural Science Foundation of China;
Keywords
Three-dimensional displays; Cameras; Pose estimation; Convolution; Computational complexity; Graph convolutional networks; Computational modeling; Training; Sun; Solid modeling; 3-D pose estimation; confidence-aware attention (CAA); graph convolutional network (GCN); multiview;
DOI
10.1109/TIM.2025.3551025
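Chinese Library Classification
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code
0808; 0809;
Abstract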
Multiview 3-D human pose estimation (HPE) addresses challenges faced by monocular methods, such as depth ambiguity and occlusion, by drawing on complementary geometric information from multiple views. However, existing multiview methods often require well-calibrated camera parameters or rely on complex parametric models. These requirements can cause inaccuracies when camera placement is perturbed and can hinder deployment. This article proposes a lightweight approach, named spatial-temporal-geometric graph convolutional network (STG-GCN), that jointly models geometric and spatial-temporal information without relying on camera parameters. We leverage the inherent connections in multiview sequences of 2-D poses, representing them as a spatial-temporal-geometric graph (STG-Graph), which allows spatial-temporal-geometric relations across joints, consecutive frames, and multiple views to be encoded simultaneously. By using a unified graph to model all features, this approach mitigates the parameter explosion that existing methods incur from using separate modules to extract spatial, temporal, and view-axis features. Building upon the STG-Graph, an adaptive confidence-aware graph convolution (ACA-GraphConv) is proposed to mitigate the impact of unreliable 2-D poses predicted by 2-D pose estimators; it uses the corresponding confidence scores to adjust the graph convolution. Experimental results on two public datasets demonstrate that STG-GCN achieves performance comparable to state-of-the-art approaches while significantly reducing the parameter count. Ablation studies also illustrate the effectiveness of ACA-GraphConv in both monocular and multiview scenarios.
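The abstract does not specify how the STG-Graph is wired or how ACA-GraphConv uses the confidence scores, so the following is only a minimal NumPy sketch of the general idea, not the authors' implementation. It assumes one node per (view, frame, joint), with edges along skeleton bones (spatial), between the same joint in consecutive frames (temporal), and between the same joint across views at the same frame (geometric). The function names `build_stg_adjacency` and `confidence_weighted_graph_conv`, the three-joint skeleton, and the choice to scale each neighbor's contribution by its 2-D confidence score are all hypothetical.

```python
import numpy as np


def build_stg_adjacency(num_views, num_frames, num_joints, bones):
    """Build one adjacency matrix over nodes indexed by (view, frame, joint).

    Assumed edge types (illustrative, not taken from the paper):
      spatial   - skeleton bones within one view and frame
      temporal  - the same joint in consecutive frames of one view
      geometric - the same joint at the same frame in different views
    """
    n = num_views * num_frames * num_joints

    def idx(v, t, j):
        return (v * num_frames + t) * num_joints + j

    adj = np.eye(n)  # self-loops
    for v in range(num_views):
        for t in range(num_frames):
            for a, b in bones:  # spatial edges
                adj[idx(v, t, a), idx(v, t, b)] = 1.0
                adj[idx(v, t, b), idx(v, t, a)] = 1.0
            for j in range(num_joints):
                if t + 1 < num_frames:  # temporal edges
                    adj[idx(v, t, j), idx(v, t + 1, j)] = 1.0
                    adj[idx(v, t + 1, j), idx(v, t, j)] = 1.0
                for u in range(v + 1, num_views):  # geometric edges
                    adj[idx(v, t, j), idx(u, t, j)] = 1.0
                    adj[idx(u, t, j), idx(v, t, j)] = 1.0
    return adj


def confidence_weighted_graph_conv(x, adj, conf, weight):
    """One graph-convolution step with confidence-scaled neighbors.

    x:      (n, c_in) node features, e.g. 2-D joint coordinates
    adj:    (n, n) binary adjacency with self-loops
    conf:   (n,) 2-D detector confidence per node, in [0, 1]
    weight: (c_in, c_out) projection (learnable in a real model)
    """
    weighted = adj * conf[None, :]  # scale each source node by its confidence
    row_sum = weighted.sum(axis=1, keepdims=True)
    normalized = weighted / np.maximum(row_sum, 1e-8)  # row-normalize
    return normalized @ x @ weight


# Toy example: 2 views, 2 frames, hypothetical 3-joint chain 0-1-2.
rng = np.random.default_rng(0)
adj = build_stg_adjacency(2, 2, 3, bones=[(0, 1), (1, 2)])
x = rng.normal(size=(12, 2))           # 2-D pose coordinates per node
conf = rng.uniform(0.1, 1.0, size=12)  # per-joint confidence scores
out = confidence_weighted_graph_conv(x, adj, conf, rng.normal(size=(2, 4)))
print(adj.shape, out.shape)  # (12, 12) (12, 4)
```

In this sketch a joint detected with zero confidence contributes nothing to its neighbors' aggregated features, and the single adjacency matrix stands in for the separate spatial, temporal, and view-axis modules the abstract contrasts against.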
Pages: 13
Related papers
  • [1] Human Pose Estimation Based on a Spatial Temporal Graph Convolutional Network
    Wu, Meng
    Shi, Pudong
    APPLIED SCIENCES-BASEL, 2023, 13 (05):
  • [2] A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video
    The Biomimetic and Intelligent Robotics Lab, School of Electromechanical Engineering, Guangdong University of Technology, Guangzhou 510006, China
    Unknown
    Unknown
    Proc IEEE Int Conf Rob Autom, 2021, : 3374 - 3380
  • [3] A Graph Attention Spatio-temporal Convolutional Network for 3D Human Pose Estimation in Video
    Liu, Junfa
    Rojas, Juan
    Li, Yihui
    Liang, Zhijun
    Guan, Yisheng
    Xi, Ning
    Zhu, Haifei
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 3374 - 3380
  • [4] Multiview Video-Based 3-D Hand Pose Estimation
    Khaleghi L.
    Sepas-Moghaddam A.
    Marshall J.
    Etemad A.
    IEEE Transactions on Artificial Intelligence, 2023, 4 (04): 896 - 909
  • [5] Multibranch Attention Graph Convolutional Networks for 3-D Human Pose Estimation
    Yin, Yanfang
    Liu, Ming
    Zhu, Qigang
    Zhang, Shuaishuai
    Hussien, Naseer Ali
    Fan, Yong
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2023, 72
  • [6] Modulated Graph Convolutional Network for 3D Human Pose Estimation
    Zou, Zhiming
    Tang, Wei
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 11457 - 11467
  • [7] Flexible Graph Convolutional Network for 3D Human Pose Estimation
    Shahjahan, Abu Taib Mohammed
    Hamza, A. Ben
    arXiv,
  • [8] 3D Human Pose Estimation in Video with Temporal and Spatial Transformer
    Peng, Sha
    Hu, Jiwei
    Proceedings of SPIE - The International Society for Optical Engineering, 2023, 12707
  • [9] Geometric Consistency-Guaranteed Spatio-Temporal Transformer for Unsupervised Multiview 3-D Pose Estimation
    Dong, Kaiwen
    Riou, Kevin
    Zhu, Jingwen
    Pastor, Andreas
    Subrin, Kevin
    Zhou, Yu
    Yun, Xiao
    Sun, Yanjing
    Le Callet, Patrick
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2024, 73
  • [10] Human pose estimation with spatial context relationships based on graph convolutional network
    Han, Na
    PROCEEDINGS OF 2020 IEEE 5TH INFORMATION TECHNOLOGY AND MECHATRONICS ENGINEERING CONFERENCE (ITOEC 2020), 2020, : 1566 - 1570