Estimating 3D Motion and Forces of Human–Object Interactions from Internet Videos

Cited by: 0
Authors
Zongmian Li
Jiri Sedlar
Justin Carpentier
Ivan Laptev
Nicolas Mansard
Josef Sivic
Affiliations
[1] PSL Research University, Département d’informatique de l’ENS, École normale supérieure, CNRS
[2] Inria Paris, Willow Project
[3] Czech Technical University, Czech Institute of Informatics, Robotics and Cybernetics
[4] Université de Toulouse, LAAS
[5] Artificial and Natural Intelligence Toulouse Institute (ANITI), CNRS
Source
International Journal of Computer Vision, 2022, 130 (02)
Keywords
Single-view 3D pose estimation; Force estimation; Person–object interaction; Instructional video; Contact recognition; Motion capture
DOI
Not available
Abstract
In this paper, we introduce a method to automatically reconstruct the 3D motion of a person interacting with an object from a single RGB video. Our method estimates the 3D poses of the person together with the object pose, the contact positions and the contact forces exerted on the human body. The main contributions of this work are three-fold. First, we introduce an approach to jointly estimate the motion and the actuation forces of the person on the manipulated object by modeling contacts and the dynamics of the interactions. This is cast as a large-scale trajectory optimization problem. Second, we develop a method to automatically recognize from the input video the 2D position and timing of contacts between the person and the object or the ground, thereby significantly reducing the complexity of the optimization. Third, we validate our approach on a recent video + MoCap dataset capturing typical parkour actions, and demonstrate its performance on a new dataset of Internet videos showing people manipulating a variety of tools in unconstrained environments.
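Illustrative sketch (not part of the original record, and not the authors' method): the abstract describes recovering motion and contact forces jointly through trajectory optimization. The toy below shows the same idea for a 1D point mass only. It fits heights to observations while penalizing changes in the force implied by Newton's second law. The function name `estimate_motion_and_force`, the weight `w_smooth`, and the point-mass model are all assumptions made for illustration; the paper's formulation covers full-body dynamics, object pose and recognized contacts.

```python
import numpy as np

def estimate_motion_and_force(obs, dt, mass, g=9.81, w_smooth=1e-4):
    """Toy 1D analogue (hypothetical helper, not the paper's solver).

    obs  : observed heights of a point mass, upward positive
    dt   : time step between observations in seconds
    mass : mass of the point in kilograms
    Returns the fitted heights q and the contact force at interior steps.
    """
    obs = np.asarray(obs, dtype=float)
    n = obs.size
    # Second-difference operator: D2 @ q approximates acceleration.
    D2 = np.diff(np.eye(n), n=2, axis=0) / dt**2
    # Difference of consecutive accelerations; since f = mass * (a + g),
    # mass * D3 @ q is the step-to-step change in contact force.
    D3 = np.diff(D2, axis=0)
    sw = np.sqrt(w_smooth)
    # Least squares: ||q - obs||^2 + w_smooth * ||mass * D3 @ q||^2
    A = np.vstack([np.eye(n), sw * mass * D3])
    b = np.concatenate([obs, np.zeros(n - 3)])
    q, *_ = np.linalg.lstsq(A, b, rcond=None)
    # Newton's second law: mass * a = force - mass * g
    force = mass * (D2 @ q + g)
    return q, force

# Usage: a point resting at constant height needs a force equal to its weight.
q, f = estimate_motion_and_force(np.full(20, 1.0), dt=0.05, mass=2.0)
```

In this toy, a stationary observation sequence yields a force equal to mass * g, and a free-fall sequence yields a force close to zero, because both trajectories have constant acceleration and so incur no smoothness penalty.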
Pages: 363-383
Page count: 20
Related papers
50 in total
  • [1] Estimating 3D Motion and Forces of Human-Object Interactions from Internet Videos
    Li, Zongmian
    Sedlar, Jiri
    Carpentier, Justin
    Laptev, Ivan
    Mansard, Nicolas
    Sivic, Josef
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2022, 130 (02) : 363 - 383
  • [2] Estimating 3D Motion and Forces of Person-Object Interactions from Monocular Video
    Li, Zongmian
    Sedlar, Jiri
    Carpentier, Justin
    Laptev, Ivan
    Mansard, Nicolas
    Sivic, Josef
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 8632 - 8641
  • [3] Understanding 3D Object Articulation in Internet Videos
    Qian, Shengyi
    Jin, Linyi
    Rockwell, Chris
    Chen, Siyi
    Fouhey, David F.
    [J]. 2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2022), 2022, : 1589 - 1599
  • [4] Articulated 3D Human-Object Interactions From RGB Videos: An Empirical Analysis of Approaches and Challenges
    Haresh, Sanjay
    Sun, Xiaohao
    Jiang, Hanxiao
    Chang, Angel X.
    Savva, Manolis
    [J]. 2022 INTERNATIONAL CONFERENCE ON 3D VISION, 3DV, 2022, : 312 - 321
  • [5] Improving Dynamic 3D Gaussian Splatting from Monocular Videos with Object Motion Information
    Luo, Yixin
    Huang, Zhangjin
    Huang, Xudong
    [J]. ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, PT XI, ICIC 2024, 2024, 14872 : 84 - 95
  • [6] Joint 3D Human Motion Capture and Physical Analysis from Monocular Videos
    Zell, Petrissa
    Wandt, Bastian
    Rosenhahn, Bodo
    [J]. 2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017, : 17 - 26
  • [7] Estimating the Coverage in 3D Reconstructions of the Colon from Colonoscopy Videos
    Muhlethaler, Emmanuelle
    Posner, Erez
    Bouhnik, Moshe
    [J]. IMAGING SYSTEMS FOR GI ENDOSCOPY, AND GRAPHS IN BIOMEDICAL IMAGE ANALYSIS, ISGIE 2022, 2022, 13754 : 56 - 65
  • [8] Unsupervised Learning of 3D Object Categories from Videos in the Wild
    Henzler, Philipp
    Reizenstein, Jeremy
    Labatut, Patrick
    Shapovalov, Roman
    Ritschel, Tobias
    Vedaldi, Andrea
    Novotny, David
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 4698 - 4707
  • [9] 3D Object Reconstruction from Hand-Object Interactions
    Tzionas, Dimitrios
    Gall, Juergen
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 729 - 737
  • [10] Estimating 3D object coordinates from markerless scenes
    Kwon, KW
    Baik, SW
    Lee, SW
    [J]. COMPUTATIONAL SCIENCE - ICCS 2005, PT 3, 2005, 3516 : 850 - 853