VoxelTrack: Multi-Person 3D Human Pose Estimation and Tracking in the Wild

Cited by: 33
Authors
Zhang, Yifu [1]
Wang, Chunyu [2]
Wang, Xinggang [1]
Liu, Wenyu [1]
Zeng, Wenjun [2]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan 430074, Peoples R China
[2] Microsoft Res Asia, Beijing 100080, Peoples R China
Keywords
3D human pose tracking; volumetric; multiple camera views; NETWORK;
DOI
10.1109/TPAMI.2022.3163709
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We present VoxelTrack for multi-person 3D pose estimation and tracking from a few cameras that are separated by wide baselines. It employs a multi-branch network to jointly estimate 3D poses and re-identification (Re-ID) features for all people in the environment. In contrast to previous efforts, which must establish cross-view correspondence based on noisy 2D pose estimates, it directly estimates and tracks 3D poses from a 3D voxel-based representation constructed from multi-view images. We first discretize the 3D space into regular voxels and compute a feature vector for each voxel by averaging the body joint heatmaps that are inversely projected from all views. We estimate 3D poses from the voxel representation by predicting whether each voxel contains a particular body joint. Similarly, a Re-ID feature is computed for each voxel and is used to track the estimated 3D poses over time. The main advantage of the approach is that it avoids making any hard decisions based on individual images. The approach can robustly estimate and track 3D poses even when people are severely occluded in some cameras. It outperforms the state-of-the-art methods by a large margin on four public datasets: Shelf, Campus, Human3.6M and CMU Panoptic.
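The voxel feature construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes each camera is given as a 3x4 projection matrix, that heatmaps are sampled at the nearest pixel, and that a view contributes zero when a voxel projects outside the image. The names `project` and `voxel_features` are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def project(P: Matrix, point: Sequence[float]) -> Optional[Tuple[float, float]]:
    """Project a 3D point with a 3x4 camera matrix.

    Returns pixel coordinates (u, v), or None if the point is not in
    front of the camera.
    """
    x, y, z = point
    h = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in P]
    if h[2] <= 0:
        return None
    return h[0] / h[2], h[1] / h[2]


def voxel_features(voxel_centers, cameras, heatmaps) -> List[List[float]]:
    """Average per-joint heatmap scores over all views for each voxel.

    heatmaps[c][j][row][col] is the score of joint j at pixel (col, row)
    in camera c. Views in which the voxel falls outside the image add
    nothing to the sum but still count in the average (an assumption).
    """
    num_joints = len(heatmaps[0])
    features = []
    for center in voxel_centers:
        acc = [0.0] * num_joints
        for P, hm in zip(cameras, heatmaps):
            uv = project(P, center)
            if uv is None:
                continue
            # Nearest-pixel sampling (assumed; interpolation is also possible).
            col = math.floor(uv[0] + 0.5)
            row = math.floor(uv[1] + 0.5)
            height, width = len(hm[0]), len(hm[0][0])
            if 0 <= row < height and 0 <= col < width:
                for j in range(num_joints):
                    acc[j] += hm[j][row][col]
        features.append([a / len(cameras) for a in acc])
    return features
```

Because the scores are averaged rather than thresholded per view, a voxel that is visible in only some cameras still receives a nonzero feature, which matches the abstract's point about avoiding hard decisions on individual images.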
Pages: 2613-2626
Page count: 14