Monocular Expressive 3D Human Reconstruction of Multiple People

Citations: 0

Authors:

Zhao, Zhenghao [1]

Tang, Hao [2]

Wan, Joy [3]

Yan, Yan [1]

Affiliations:

[1] Illinois Inst Technol, Chicago, IL 60616 USA

[2] Carnegie Mellon Univ, Pittsburgh, PA USA

[3] Univ Illinois, Urbana, IL USA

Source:

PROCEEDINGS OF THE 4TH ANNUAL ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA RETRIEVAL, ICMR 2024 | 2024

Keywords:

3D Pose Estimation; Whole-body Pose Estimation; Multi-person Pose Estimation

DOI:

10.1145/3652583.3658092

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Whole-body pose estimation aims to regress human pose models that include the body, hand, and facial details from RGB images. While the task of whole-body mesh recovery has been extensively studied in recent literature, the focus has predominantly been on human mesh recovery for a single person, despite the frequent occurrence of multiple people in practical scenarios. Similar to body-only cases, such single-person whole-body pose estimation methods often fail in the multiple-people problem for two reasons: (i) Given an ambiguous bounding box, which could contain more than one instance, it is difficult for single-person-oriented methods to regress the body mesh model of the target person. (ii) Single-person pose estimation approaches neglect person-person occlusions and the depth order among instances, thus generating interpenetrated models. In this paper, we propose the Multi-person Expressive POse (MEPO) model, which exploits expressive 3D human model reconstruction for multiple people. To the best of our knowledge, our model is the first multi-person whole-body mesh reconstruction model; it is strengthened by a heatmap, a depthmap, and a depth order loss. We propose the Heatmap Enhancement Net (HENet) to leverage the heatmap information to help the model concentrate on the target person in crowded multi-person cases, while the depthmap delivers depth information of the image. Furthermore, we impose a depth order loss to recover human meshes precisely for overlapping people. In our experiments, we evaluate our model on multiple challenging datasets, including AGORA, which contains complex occlusions similar to real-world scenarios. Our method achieves a significant performance improvement over state-of-the-art pose estimation methods.

Pages: 423-432

Page count: 10
