Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints

Cited by: 204
Authors
Zanfir, Andrei [2]
Marinoiu, Elisabeta [2]
Sminchisescu, Cristian [1, 2]
Affiliations
[1] Lund Univ, Fac Engn, Dept Math, Lund, Sweden
[2] Romanian Acad, Inst Math, Bucharest, Romania
Funding
EU Horizon 2020; European Research Council
Keywords
MOTION CAPTURE
DOI
10.1109/CVPR.2018.00229
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Human sensing has greatly benefited from recent advances in deep learning, parametric human modeling, and large scale 2d and 3d datasets. However, existing 3d models make strong assumptions about the scene, considering either a single person per image, full views of the person, a simple background or many cameras. In this paper, we leverage state-of-the-art deep multi-task neural networks and parametric human and scene modeling, towards a fully automatic monocular visual sensing system for multiple interacting people, which (i) infers the 2d and 3d pose and shape of multiple people from a single image, relying on detailed semantic representations at both model and image level, to guide a combined optimization with feedforward and feedback components, (ii) automatically integrates scene constraints including ground plane support and simultaneous volume occupancy by multiple people, and (iii) extends the single image model to video by optimally solving the temporal person assignment problem and imposing coherent temporal pose and motion reconstructions while preserving image alignment fidelity. We perform experiments on both single and multi-person datasets, and systematically evaluate each component of the model, showing improved performance and extensive multiple human sensing capability. We also apply our method to images with multiple people, severe occlusions and diverse backgrounds captured in challenging natural scenes, and obtain results of good perceptual quality.
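The temporal person assignment step in (iii) can be read as a minimum-cost matching between the people reconstructed in consecutive frames. The sketch below is only an illustration of that idea, not the authors' implementation: the function name `assign_people`, the use of Euclidean distance between 3D root positions as the cost, and the sample coordinates are all assumptions, since the abstract does not state the cost the paper actually uses. It finds the optimal matching by exhaustive search, which is practical only for a handful of people.

```python
from itertools import permutations
from math import dist


def assign_people(prev, curr):
    """Match people across two frames by minimum total 3D distance.

    prev, curr: equal-length lists of (x, y, z) root positions (hypothetical
    inputs; the cost function is an assumption, not taken from the paper).
    Returns a tuple p where p[i] is the index in curr matched to prev[i].
    """
    if len(prev) != len(curr):
        raise ValueError("this sketch assumes the same number of people per frame")

    def total_cost(p):
        # Sum of distances for one candidate one-to-one assignment.
        return sum(dist(prev[i], curr[j]) for i, j in enumerate(p))

    # Exhaustive search over all one-to-one assignments (optimal, but O(n!)).
    return min(permutations(range(len(curr))), key=total_cost)


# Two people whose detection order is swapped between frames.
prev_frame = [(0.0, 0.0, 3.0), (1.5, 0.0, 4.0)]
curr_frame = [(1.4, 0.0, 4.1), (0.1, 0.0, 3.0)]
matching = assign_people(prev_frame, curr_frame)
print(matching)  # (1, 0): person 0 maps to detection 1, person 1 to detection 0
```

For more than a few people, a polynomial-time solver such as the Hungarian algorithm would replace the exhaustive search while returning the same optimal matching.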
Pages: 2148-2157
Page count: 10