Spatial-Temporal-Geometric Graph Convolutional Network for 3-D Human Pose Estimation From Multiview Video

Cited by: 0

Authors:

Dong, Kaiwen [1]

Zhou, Yu [2]

Riou, Kevin [3]

Yun, Xiao [2]

Sun, Yanjing [2]

Subrin, Kevin [3]

Le Callet, Patrick [3,4]

Affiliations:

[1] China Univ Min & Technol, IOT Percept Mine Res Ctr, Xuzhou 221008, Peoples R China

[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou 221116, Peoples R China

[3] Nantes Univ, Ecole Cent Nantes, CNRS, UMR 6004,LS2N, F-44306 Nantes, France

[4] Inst Univ France IUF, F-75005 Paris, France

Source:

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT | 2025 / Vol. 74

Funding:

National Natural Science Foundation of China;

Keywords:

Three-dimensional displays; Cameras; Pose estimation; Convolution; Computational complexity; Graph convolutional networks; Computational modeling; Training; Sun; Solid modeling; 3-D pose estimation; confidence-aware attention (CAA); graph convolutional network (GCN); multiview;

DOI:

10.1109/TIM.2025.3551025

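Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes: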

0808 ; 0809 ;

Abstract:

Multiview 3-D human pose estimation (HPE) addresses challenges faced by monocular methods, such as depth ambiguity and occlusion, by exploiting complementary geometric information from multiple views. However, existing multiview methods often require well-calibrated camera parameters or rely on complex parametric models. These requirements can cause inaccuracies when camera placement is perturbed and can hinder deployment. This article proposes a lightweight approach, named spatial-temporal-geometric graph convolutional network (STG-GCN), that jointly models geometric and spatial-temporal information without relying on camera parameters. We leverage the inherent connections in multiview sequences of 2-D poses, representing them as a spatial-temporal-geometric graph (STG-Graph), which allows spatial, temporal, and geometric relations to be encoded simultaneously across joints, consecutive frames, and multiple views. By using a unified graph to model all features, this approach reduces the parameter explosion that existing methods incur by using separate modules to extract spatial, temporal, and view-axis features. Building upon the STG-Graph, an adaptive confidence-aware graph convolution (ACA-GraphConv) is proposed to mitigate the impact of unreliable 2-D poses predicted by 2-D pose estimators. It does so by using the corresponding confidence scores to adjust the graph convolution. Experimental results on two public datasets demonstrate that STG-GCN achieves performance comparable to state-of-the-art approaches while significantly reducing the number of parameters. Ablation studies also illustrate the effectiveness of ACA-GraphConv in both monocular and multiview scenarios.
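
The abstract does not give the exact formulation of the STG-Graph or of ACA-GraphConv, so the following is only an illustrative sketch of the general idea under stated assumptions. It assumes one graph node per (view, frame, joint) triple, with edges along skeleton bones (spatial), between the same joint in consecutive frames (temporal), and between the same joint in different views at the same frame (geometric). The confidence weighting shown here, which scales each neighbour's contribution by its 2-D detection confidence and then row-normalizes, is a simple stand-in and not the authors' operator. The names `build_stg_adjacency` and `confidence_aware_graph_conv` are hypothetical.

```python
import numpy as np


def build_stg_adjacency(num_views, num_frames, num_joints, bones):
    """Build a dense adjacency matrix over V*T*J nodes (assumed layout).

    bones: list of (joint_i, joint_j) pairs describing the skeleton.
    """
    n = num_views * num_frames * num_joints
    adj = np.eye(n)  # self-loops

    def idx(v, t, j):
        # Flatten a (view, frame, joint) triple to a node index.
        return (v * num_frames + t) * num_joints + j

    for v in range(num_views):
        for t in range(num_frames):
            # Spatial edges: skeleton bones within one view and frame.
            for i, j in bones:
                a, b = idx(v, t, i), idx(v, t, j)
                adj[a, b] = adj[b, a] = 1.0
            for j in range(num_joints):
                a = idx(v, t, j)
                # Temporal edge: same joint, same view, next frame.
                if t + 1 < num_frames:
                    b = idx(v, t + 1, j)
                    adj[a, b] = adj[b, a] = 1.0
                # Geometric edges: same joint, same frame, other views.
                for w in range(v + 1, num_views):
                    b = idx(w, t, j)
                    adj[a, b] = adj[b, a] = 1.0
    return adj


def confidence_aware_graph_conv(x, adj, conf, weight):
    """Graph convolution that down-weights low-confidence nodes (assumption).

    x:      (n, c_in) node features, e.g. 2-D joint coordinates
    adj:    (n, n) adjacency with self-loops
    conf:   (n,) confidence score per node in [0, 1]
    weight: (c_in, c_out) feature transform (learned in a real model)
    """
    # Scale every neighbour's contribution by that neighbour's confidence.
    weighted = adj * conf[None, :]
    # Row-normalize so each node takes a weighted average of its neighbours.
    norm = np.maximum(weighted.sum(axis=1, keepdims=True), 1e-8)
    return (weighted / norm) @ x @ weight
```

In this sketch a node with confidence 0 contributes nothing to its neighbours, which then rely on the same joint seen in other frames and views. A trained model would presumably learn how strongly to apply such a correction, which is what the word "adaptive" in the abstract suggests.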


Pages: 13
