DaNet: Decompose-and-aggregate Network for 3D Human Shape and Pose Estimation

Times cited: 22

Authors:

Zhang, Hongwen [1, 2, 3]

Cao, Jie [1, 2, 3]

Lu, Guo [4]

Ouyang, Wanli [5]

Sun, Zhenan [1, 2, 3]

Affiliations:

[1] Chinese Acad Sci, Inst Automat, CRIPAC, Beijing, Peoples R China

[2] Chinese Acad Sci, Inst Automat, NLPR, Beijing, Peoples R China

[3] Univ Chinese Acad Sci, Beijing, Peoples R China

[4] Shanghai Jiao Tong Univ, Shanghai, Peoples R China

[5] Univ Sydney, SenseTime Comp Vis Res Grp, Sydney, NSW, Australia

Source:

PROCEEDINGS OF THE 27TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA (MM'19) | 2019

Funding:

National Natural Science Foundation of China

Keywords:

Decompose-and-aggregate Network; 3D human shape and pose estimation; position-aided rotation feature refinement;

DOI:

10.1145/3343031.3351057

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Reconstructing 3D human shape and pose from a monocular image is challenging despite the promising results achieved by recent learning-based methods. The commonly occurring misalignment arises because the mapping from image to model space is highly non-linear and the rotation-based pose representation of the body model is prone to drift in joint positions. In this work, we present the Decompose-and-aggregate Network (DaNet) to address these issues. DaNet includes three new designs, namely UVI guided learning, decomposition for fine-grained perception, and aggregation for robust prediction. First, we adopt UVI maps, which densely build a bridge between 2D pixels and 3D vertices, as an intermediate representation to facilitate the learning of the image-to-model mapping. Second, we decompose the prediction task into one global stream and multiple local streams so that the network not only provides global perception for camera and shape prediction, but also has detailed perception for part pose prediction. Lastly, we aggregate the messages from the local streams to enhance the robustness of part pose prediction, where a position-aided rotation feature refinement strategy is proposed to exploit the spatial relationships between body parts. Such a refinement strategy is more efficient since the correlations between position features are stronger than those in the original rotation feature space. The effectiveness of our method is validated on the Human3.6M and UP-3D datasets. Experimental results show that the proposed method significantly improves the reconstruction performance in comparison with previous state-of-the-art methods. Our code is publicly available at https://github.com/HongwenZhang/DaNet-3DHumanReconstrution.


Pages: 935-944

Page count: 10
