Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos

Authors
Cheng, Yu [1]
Wang, Bo [2]
Yang, Bo [2]
Tan, Robby T. [1, 3]
Affiliations
[1] National University of Singapore, Singapore, Singapore
[2] Tencent Game AI Research Center, Shenzhen, People's Republic of China
[3] Yale-NUS College, Singapore, Singapore
Funding
National Research Foundation, Singapore
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Despite recent progress, 3D multi-person pose estimation from monocular videos remains challenging because of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework that integrates graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses without requiring camera parameters. In particular, we introduce a human-joint GCN, which, unlike existing GCNs, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results. We also introduce a human-bone GCN, which models the bone connections and provides information beyond the human joints. The two GCNs work together to estimate the spatial frame-wise 3D poses, and can use both visible joint and bone information in the target frame to estimate occluded or missing human-part information. To further refine the 3D pose estimation, we use temporal convolutional networks (TCNs) to enforce temporal and human-dynamics constraints. We use a joint-TCN to estimate person-centric 3D poses across frames, and propose a velocity-TCN to estimate the speed of 3D joints, ensuring the consistency of the 3D pose estimation in consecutive frames. Finally, to estimate the 3D human poses of multiple persons, we propose a root-TCN that estimates camera-centric 3D poses without requiring camera parameters. Quantitative and qualitative evaluations demonstrate the effectiveness of the proposed method. Our code and models are available at https://github.com/3dpose/GnTCN.
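The idea of a directed joint graph weighted by 2D-detector confidence can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the three-joint toy skeleton, the row normalization, and the random weight matrix are all assumptions chosen only to show how a low-confidence (likely occluded) joint can be made to contribute less to its neighbors while still receiving information from them.

```python
import numpy as np

def confidence_weighted_adjacency(edges, conf, num_joints):
    """Directed, row-normalized adjacency (hypothetical helper).

    An edge (src, dst) passes information from src to dst and is weighted
    by conf[src], so unreliable joints influence their neighbors less.
    """
    A = np.zeros((num_joints, num_joints))
    for src, dst in edges:
        A[dst, src] = conf[src]
    A += np.diag(conf)  # self-loops weighted by the joint's own confidence
    row_sum = A.sum(axis=1, keepdims=True)
    return A / np.maximum(row_sum, 1e-8)  # guard against all-zero rows

def gcn_layer(X, A, W):
    """One graph-convolution step: aggregate neighbors, project, ReLU."""
    return np.maximum(A @ X @ W, 0.0)

# Toy chain of three joints (0 - 1 - 2); joint 1 is assumed occluded.
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
conf = np.array([0.9, 0.0, 0.8])           # assumed 2D detector confidences
X = np.array([[0.1, 0.2],                  # 2D coordinates per joint
              [0.0, 0.0],                  # unreliable detection
              [0.5, 0.6]])

A = confidence_weighted_adjacency(edges, conf, num_joints=3)
W = np.random.default_rng(0).normal(size=(2, 4))  # placeholder for learned weights
H = gcn_layer(X, A, W)                             # features of shape (3, 4)
```

In this toy setup the adjacency matrix is asymmetric: the occluded joint 1 receives features from joints 0 and 2, but sends nothing back to them, which is the directional behavior the abstract attributes to the human-joint GCN.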
Pages: 1157-1165
Number of pages: 9