3D Human Mesh Reconstruction by Learning to Sample Joint Adaptive Tokens for Transformers

Cited by: 6

Authors:

Xue, Youze [1]

Chen, Jiansheng [2]

Zhang, Yudong [1]

Yu, Cheng [1]

Ma, Huimin [2]

Ma, Hongbing [1]

Affiliations:

[1] Tsinghua Univ, Beijing, Peoples R China

[2] Univ Sci & Technol, Beijing, Peoples R China

Source:

PROCEEDINGS OF THE 30TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2022 | 2022

Funding:

National Natural Science Foundation of China;

Keywords:

3D human pose estimation; vision transformers; learnable sampling;

DOI:

10.1145/3503161.3548133

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Reconstructing a 3D human mesh from a single RGB image is a challenging task due to the inherent depth ambiguity. Researchers commonly use convolutional neural networks to extract features and then apply spatial aggregation to the feature maps to explore the 3D cues embedded in the 2D image. Recently, two methods of spatial aggregation, transformers and spatial attention, have been adopted to achieve state-of-the-art performance, but both have limitations. Transformers help model long-range dependencies across different joints, but their grid tokens do not adapt to the positions and shapes of human joints in different images. Spatial attention, in contrast, focuses on joint-specific features, but its concentrated attention maps ignore non-local information about the body. To address these issues, we propose a Learnable Sampling module that generates joint-adaptive tokens, and then use transformers to aggregate global information. Feature vectors are sampled from the feature maps to form the tokens of different joints. The sampling weights are predicted by a learnable network, so the model can learn to sample joint-related features adaptively. Our adaptive tokens are explicitly correlated with human joints, which enables more effective modeling of global dependencies among different human joints. To validate the effectiveness of our method, we conduct experiments on several popular datasets, including Human3.6M and 3DPW. Our method achieves lower reconstruction errors on both the vertex-based metric and the joint-based metric than previous state-of-the-art methods. The code and trained models are released at https://github.com/thuxyz19/Learnable-Sampling.
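The abstract describes forming one token per joint by sampling feature vectors from the feature maps with predicted weights. The sketch below is only an illustration of that general idea, not the authors' module: it assumes the sampling weights are a softmax over per-location scores and that a token is the resulting weighted sum, and the names `softmax` and `sample_joint_tokens` are hypothetical. In a real model the scores would come from a learnable network; here they are fixed numbers.

```python
import math


def softmax(scores):
    """Normalize raw scores into non-negative weights that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_joint_tokens(features, joint_scores):
    """Build one token per joint as a weighted sum of feature vectors.

    features:     list of N feature vectors (each of length C), one per
                  spatial location of a flattened feature map.
    joint_scores: list of J score lists (each of length N). Assumption: in a
                  trained model these would be predicted by a network.
    Returns a list of J tokens, each of length C.
    """
    channels = len(features[0])
    tokens = []
    for scores in joint_scores:
        weights = softmax(scores)
        token = [
            sum(w * f[c] for w, f in zip(weights, features))
            for c in range(channels)
        ]
        tokens.append(token)
    return tokens


# Three spatial locations with two channels each (toy values).
features = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
# Joint 0 weights all locations equally; joint 1 focuses on location 0.
joint_scores = [[0.0, 0.0, 0.0], [50.0, 0.0, 0.0]]
tokens = sample_joint_tokens(features, joint_scores)
```

With uniform scores the token is the average of all feature vectors, while a strongly peaked score vector makes the token close to the feature at a single location. The resulting per-joint tokens are what a transformer would then take as input to aggregate global information.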


Pages: 6765-6773

Page count: 9

50 items in total

[1] JOTR: 3D Joint Contrastive Learning with Transformers for Occluded Human Mesh Recovery
Li, Jiahao
Yang, Zongxin
Wang, Xiaohan
Ma, Jianxin
Zhou, Chang
Yang, Yi
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 9076 - 9087
[2] Multimodal Token Fusion and Optimization for 3D Human Mesh Reconstruction with Transformers
Jiang, Yang
Wang, Sunli
Sun, Mingyang
Kou, Dongliang
Xie, Qiangbin
Zhang, Lihuang
PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2024, PT VI, 2025, 15036 : 593 - 605
[3] Adaptive mesh generation of MRI images for 3D reconstruction of human trunk
Courchesne, O.
Guibault, F.
Dompierre, J.
Cheriet, F.
IMAGE ANALYSIS AND RECOGNITION, PROCEEDINGS, 2007, 4633 : 1040 - +
[4] Joint Reconstruction of Image and Motion in MRI: Implicit Regularization Using an Adaptive 3D Mesh
Menini, Anne
Vuissoz, Pierre-Andre
Felblinger, Jacques
Odille, Freddy
MEDICAL IMAGE COMPUTING AND COMPUTER-ASSISTED INTERVENTION - MICCAI 2012, PT I, 2012, 7510 : 264 - 271
[5] Adaptive Joint Optimization for 3D Reconstruction With Differentiable Rendering
Zhang, Jingbo
Wan, Ziyu
Liao, Jing
IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (06) : 3039 - 3051
[6] Learning Human Mesh Recovery in 3D Scenes
Shen, Zehong
Cen, Zhi
Peng, Sida
Shuai, Qing
Bao, Hujun
Zhou, Xiaowei
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 17038 - 17047
[7] Accurate 3D Face Reconstruction with Facial Component Tokens
Zhang, Tianke
Chu, Xuangeng
Liu, Yunfei
Lin, Lijian
Yang, Zhendong
Xu, Zhengzhuo
Cao, Chengkun
Yu, Fei
Zhou, Changyin
Yuan, Chun
Li, Yu
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2023), 2023: 8999 - 9008
[8] 3D adaptive mesh refinement
Merrouche, A
Selman, A
Knopf-Lenoir, C
COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING, 1998, 14 (05): 397 - 407
[9] An adaptive mesh model for 3D reconstruction from unorganized data points
Hu, WQ
Yang, WY
Xiong, YL
INTERNATIONAL JOURNAL OF ADVANCED MANUFACTURING TECHNOLOGY, 2005, 26 (11-12): 1362 - 1369
[10] An adaptive mesh model for 3D reconstruction from unorganized data points
W. Hu
W. Yang
Y. Xiong
The International Journal of Advanced Manufacturing Technology, 2005, 26 : 1362 - 1369
