PhysCap: Physically Plausible Monocular 3D Motion Capture in Real Time

Cited by: 106
Authors
Shimada, Soshi [1 ]
Golyanik, Vladislav [1 ]
Xu, Weipeng [2 ]
Theobalt, Christian [1 ]
Affiliations
[1] Max Planck Inst Informat, Saarland Informat Campus, Saarbrucken, Germany
[2] Facebook Reality Labs, Pittsburgh, PA USA
Source
ACM TRANSACTIONS ON GRAPHICS | 2020 / Vol. 39 / No. 6
Funding
European Research Council; EU Horizon 2020
Keywords
Monocular Motion Capture; Physics-Based Constraints; Real Time; Human Body; Global; 3D;
DOI
10.1145/3414685.3417877
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
Marker-less 3D human motion capture from a single colour camera has seen significant progress. However, it is a very challenging and severely ill-posed problem. In consequence, even the most accurate state-of-the-art approaches have significant limitations. Purely kinematic formulations on the basis of individual joints or skeletons, and the frequent frame-wise reconstruction in state-of-the-art methods greatly limit 3D accuracy and temporal stability compared to multi-view or marker-based motion capture. Further, captured 3D poses are often physically incorrect and biomechanically implausible, or exhibit implausible environment interactions (floor penetration, foot skating, unnatural body leaning and strong shifting in depth), which is problematic for any use case in computer graphics. We, therefore, present PhysCap, the first algorithm for physically plausible, real-time and marker-less human 3D motion capture with a single colour camera at 25 fps. Our algorithm first captures 3D human poses purely kinematically. To this end, a CNN infers 2D and 3D joint positions, and subsequently, an inverse kinematics step finds space-time coherent joint angles and global 3D pose. Next, these kinematic reconstructions are used as constraints in a real-time physics-based pose optimiser that accounts for environment constraints (e.g., collision handling and floor placement), gravity, and biophysical plausibility of human postures. Our approach employs a combination of ground reaction force and residual force for plausible root control, and uses a trained neural network to detect foot contact events in images. Our method captures physically plausible and temporally stable global 3D human motion, without physically implausible postures, floor penetrations or foot skating, from video in real time and in general scenes. 
PhysCap achieves state-of-the-art accuracy on established pose benchmarks, and we propose new metrics to demonstrate the improved physical plausibility and temporal stability.
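The second stage described in the abstract, where kinematic estimates act as targets for a physics-based step that respects gravity and the floor, can be illustrated with a toy example. The sketch below is not the PhysCap optimiser: it assumes a single point mass moving along the vertical axis, a PD tracking force, and a floor at height 0. The function name `track_height` and all parameters (`kp`, `kd`, `mass`, `dt`) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def track_height(
    kinematic_heights: Sequence[float],
    dt: float = 0.04,   # 25 fps, as quoted in the abstract
    kp: float = 100.0,  # assumed proportional gain
    kd: float = 20.0,   # assumed damping gain
    mass: float = 1.0,  # assumed point mass in kg
    g: float = 9.81,
) -> List[float]:
    """Follow per-frame kinematic height estimates with a simulated point mass.

    A PD force pulls the mass toward each kinematic target, gravity acts
    downward, and the floor at height 0 may not be penetrated.
    """
    y = max(kinematic_heights[0], 0.0)  # start on or above the floor
    v = 0.0
    tracked = []
    for target in kinematic_heights:
        # PD tracking force toward the kinematic estimate, plus gravity.
        force = kp * (target - y) - kd * v - mass * g
        # Semi-implicit Euler: update velocity first, then position.
        v += (force / mass) * dt
        y += v * dt
        # Floor constraint: clamp the position and remove downward velocity.
        if y < 0.0:
            y = 0.0
            v = max(v, 0.0)
        tracked.append(y)
    return tracked


if __name__ == "__main__":
    # Kinematic targets that dip below the floor (negative heights).
    noisy = [0.10, 0.05, -0.03, -0.05, 0.02, 0.10]
    print(track_height(noisy))
```

Even when the kinematic targets dip below zero, the tracked heights stay on or above the floor. This mirrors, in a much reduced form, the floor-penetration constraint the abstract mentions. The full method additionally handles articulated bodies, foot-contact detection and ground reaction forces, none of which are modelled here.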
Pages: 16