DaNet: Decompose-and-aggregate Network for 3D Human Shape and Pose Estimation

Cited by: 22
Authors
Zhang, Hongwen [1, 2, 3]
Cao, Jie [1, 2, 3]
Lu, Guo [4]
Ouyang, Wanli [5]
Sun, Zhenan [1, 2, 3]
Affiliations
[1] Chinese Acad Sci, Inst Automat, CRIPAC, Beijing, Peoples R China
[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China
[3] Univ Chinese Acad Sci, Beijing, Peoples R China
[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China
[5] Univ Sydney, SenseTime Comp Vis Res Grp, Sydney, NSW, Australia
Funding
National Natural Science Foundation of China
Keywords
Decompose-and-aggregate Network; 3D human shape and pose estimation; position-aided rotation feature refinement;
DOI
10.1145/3343031.3351057
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Reconstructing 3D human shape and pose from a monocular image remains challenging despite the promising results achieved by recent learning-based methods. The commonly occurring misalignment arises because the mapping from image to model space is highly non-linear and the rotation-based pose representation of the body model is prone to drift in joint positions. In this work, we present the Decompose-and-aggregate Network (DaNet) to address these issues. DaNet includes three new designs: UVI guided learning, decomposition for fine-grained perception, and aggregation for robust prediction. First, we adopt UVI maps, which densely build a bridge between 2D pixels and 3D vertices, as an intermediate representation to facilitate learning the image-to-model mapping. Second, we decompose the prediction task into one global stream and multiple local streams so that the network not only provides global perception for camera and shape prediction but also has detailed perception for part pose prediction. Lastly, we aggregate the messages from the local streams to enhance the robustness of part pose prediction, where a position-aided rotation feature refinement strategy is proposed to exploit the spatial relationships between body parts. Such a refinement strategy is more efficient since the correlations between position features are stronger than those in the original rotation feature space. The effectiveness of our method is validated on the Human3.6M and UP-3D datasets. Experimental results show that the proposed method significantly improves the reconstruction performance in comparison with previous state-of-the-art methods. Our code is publicly available at https://github.com/HongwenZhang/DaNet-3DHumanReconstrution.
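The abstract's remark that a rotation-based pose representation is prone to joint-position drift can be illustrated with a toy example. The sketch below is not taken from the DaNet code; it assumes a hypothetical planar three-bone chain (not the SMPL body model) in which each joint angle is relative to its parent, and shows how a small rotation error at the root joint produces position errors that grow along the chain.

```python
import math


def forward_kinematics(bone_lengths, relative_angles):
    """Return 2D joint positions of a planar chain rooted at the origin.

    Each angle is relative to the parent bone, mimicking a rotation-based
    pose parameterisation (toy assumption, not the SMPL model).
    """
    x, y, heading = 0.0, 0.0, 0.0
    positions = [(x, y)]
    for length, angle in zip(bone_lengths, relative_angles):
        heading += angle  # child rotations accumulate on top of the parent's
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        positions.append((x, y))
    return positions


def joint_errors(bone_lengths, true_angles, predicted_angles):
    """Euclidean position error of every joint, from root to end of chain."""
    true_pos = forward_kinematics(bone_lengths, true_angles)
    pred_pos = forward_kinematics(bone_lengths, predicted_angles)
    return [math.dist(p, q) for p, q in zip(true_pos, pred_pos)]


# Hypothetical chain: three unit-length bones, straight in the true pose.
bones = [1.0, 1.0, 1.0]
true_angles = [0.0, 0.0, 0.0]
# The prediction is wrong only at the root joint, by 0.05 rad.
predicted_angles = [0.05, 0.0, 0.0]

errors = joint_errors(bones, true_angles, predicted_angles)
for index, error in enumerate(errors):
    print(f"joint {index}: position error = {error:.4f}")
```

In this toy setting the error at the root is zero and then increases with each joint further down the chain, even though only one rotation was mispredicted. This is the kind of drift that motivates using position information to refine rotation features.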
Pages: 935-944
Number of pages: 10