Learning 3D Human Pose Estimation from Dozens of Datasets using a Geometry-Aware Autoencoder to Bridge Between Skeleton Formats

Citations: 8
Authors
Sarandi, Istvan [1 ]
Hermans, Alexander [1 ]
Leibe, Bastian [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Aachen, Germany
DOI
10.1109/WACV56688.2023.00297
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Deep learning-based 3D human pose estimation performs best when trained on large amounts of labeled data, making combined learning from many datasets an important research direction. One obstacle to this endeavor is that different datasets provide different skeleton formats, i.e., they do not label the same set of anatomical landmarks. There is little prior research on how best to supervise one model with such discrepant labels. We show that simply using separate output heads for different skeletons results in inconsistent depth estimates and insufficient information sharing across skeletons. As a remedy, we propose a novel affine-combining autoencoder (ACAE) method to perform dimensionality reduction on the number of landmarks. The discovered latent 3D points capture the redundancy among skeletons, enabling enhanced information sharing when used for consistency regularization. Our approach scales to an extreme multi-dataset regime, where we use 28 3D human pose datasets to supervise one model, which outperforms prior work on a range of benchmarks, including the challenging 3D Poses in the Wild (3DPW) dataset. Our code and models are available for research purposes.
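The abstract names an "affine-combining" autoencoder but does not spell out the mechanics. The sketch below illustrates only the general idea suggested by that name, and is not the authors' implementation: each latent 3D point is a weighted sum of input landmarks with weights summing to 1, so shifting the whole pose shifts the latent points by the same amount. The sizes, the random weights, and all function names are hypothetical, and the weights are drawn positive here only to keep row sums away from zero.

```python
import math
import random


def normalize_rows(weights):
    """Scale each row to sum to 1, so each output point is an affine combination."""
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total for w in row])
    return normalized


def combine(weights, points):
    """Apply a (rows x cols) weight matrix to a list of `cols` 3D points."""
    return [
        tuple(sum(w * p[axis] for w, p in zip(row, points)) for axis in range(3))
        for row in weights
    ]


def translate(points, shift):
    """Add the same 3D offset to every point."""
    return [tuple(c + s for c, s in zip(p, shift)) for p in points]


def points_close(a, b):
    """Compare two lists of 3D points coordinate by coordinate."""
    return all(
        math.isclose(x, y, abs_tol=1e-9)
        for p, q in zip(a, b)
        for x, y in zip(p, q)
    )


random.seed(0)
NUM_JOINTS, NUM_LATENT = 8, 3  # hypothetical sizes, not taken from the paper

# Random stand-ins for learned weights (assumption: a real model would train these).
w_enc = normalize_rows(
    [[random.uniform(0.1, 1.0) for _ in range(NUM_JOINTS)] for _ in range(NUM_LATENT)]
)
w_dec = normalize_rows(
    [[random.uniform(0.1, 1.0) for _ in range(NUM_LATENT)] for _ in range(NUM_JOINTS)]
)

pose = [tuple(random.gauss(0.0, 1.0) for _ in range(3)) for _ in range(NUM_JOINTS)]
shift = (0.5, -1.0, 2.0)

latent = combine(w_enc, pose)                      # fewer points than landmarks
latent_of_shifted = combine(w_enc, translate(pose, shift))
reconstruction = combine(w_dec, latent)            # back to the landmark count

# Encoding a shifted pose equals shifting the encoded pose.
translation_equivariant = points_close(latent_of_shifted, translate(latent, shift))
print(translation_equivariant)
```

Because the combination acts on whole 3D points rather than on individual coordinates, the same weights also commute with rotations of the pose, which is one plausible reading of "geometry-aware" in the title.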
Pages: 2955-2965
Page count: 11