Monocular Expressive 3D Human Reconstruction of Multiple People

Cited by: 0
Authors
Zhao, Zhenghao [1]
Tang, Hao [2]
Wan, Joy [3]
Yan, Yan [1]
Affiliations
[1] Illinois Inst Technol, Chicago, IL 60616 USA
[2] Carnegie Mellon Univ, Pittsburgh, PA USA
[3] Univ Illinois, Urbana, IL USA
Keywords
3D Pose Estimation; Whole-body Pose Estimation; Multi-person Pose Estimation
DOI
10.1145/3652583.3658092
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Whole-body pose estimation aims to regress human pose models that capture body, hand, and facial details from RGB images. While whole-body mesh recovery has been studied extensively in recent literature, the focus has been predominantly on single-person mesh recovery, even though multiple people frequently appear in practical scenarios. As in the body-only case, single-person whole-body pose estimation methods often fail in multi-person settings for two reasons: (i) given an ambiguous bounding box, which may contain more than one instance, it is difficult for single-person-oriented methods to regress the body mesh of the target person; (ii) single-person approaches neglect person-person occlusions and the depth order among instances, and thus generate interpenetrating meshes. In this paper, we propose the Multi-person Expressive POse (MEPO) model, which tackles expressive 3D human model reconstruction for multiple people. To the best of our knowledge, our model is the first multi-person whole-body mesh reconstruction model; it is enhanced by a heatmap, a depth map, and a depth order loss. We propose the Heatmap Enhancement Net (HENet), which leverages heatmap information to help the model concentrate on the target person in crowded multi-person cases, while the depth map supplies depth information about the image. Furthermore, we impose a depth order loss to recover human meshes precisely for overlapping people. In our experiments, we evaluate our model on multiple challenging datasets, including AGORA, which contains complex occlusions resembling real-world scenarios. Our method achieves a significant performance improvement over state-of-the-art pose estimation methods.
Pages: 423-432
Page count: 10