Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes: The Importance of Multiple Scene Constraints

Cited by: 204

Authors:

Zanfir, Andrei [2]

Marinoiu, Elisabeta [2]

Sminchisescu, Cristian [1, 2]

Affiliations:

[1] Lund Univ, Fac Engn, Dept Math, Lund, Sweden

[2] Romanian Acad, Inst Math, Bucharest, Romania

Source:

2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018

Funding:

EU Horizon 2020; European Research Council;

Keywords:

MOTION CAPTURE;

DOI:

10.1109/CVPR.2018.00229

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Human sensing has greatly benefited from recent advances in deep learning, parametric human modeling, and large scale 2d and 3d datasets. However, existing 3d models make strong assumptions about the scene, considering either a single person per image, full views of the person, a simple background or many cameras. In this paper, we leverage state-of-the-art deep multi-task neural networks and parametric human and scene modeling, towards a fully automatic monocular visual sensing system for multiple interacting people, which (i) infers the 2d and 3d pose and shape of multiple people from a single image, relying on detailed semantic representations at both model and image level, to guide a combined optimization with feedforward and feedback components, (ii) automatically integrates scene constraints including ground plane support and simultaneous volume occupancy by multiple people, and (iii) extends the single image model to video by optimally solving the temporal person assignment problem and imposing coherent temporal pose and motion reconstructions while preserving image alignment fidelity. We perform experiments on both single and multi-person datasets, and systematically evaluate each component of the model, showing improved performance and extensive multiple human sensing capability. We also apply our method to images with multiple people, severe occlusions and diverse backgrounds captured in challenging natural scenes, and obtain results of good perceptual quality.


Pages: 2148-2157

Page count: 10
