Monocular 3D Human Pose Estimation by Predicting Depth on Joints

Cited by: 79
Authors
Nie, Bruce Xiaohan [1]
Wei, Ping [1,2]
Zhu, Song-Chun [1]
Affiliations
[1] UCLA, Ctr Vis Cognit Learning & Auton, Los Angeles, CA 90024 USA
[2] Xi An Jiao Tong Univ, Xian, Shaanxi, Peoples R China
Funding
U.S. National Science Foundation
DOI
10.1109/ICCV.2017.373
Chinese Library Classification
TP18 [Theory of artificial intelligence]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper aims at estimating full-body 3D human poses from monocular images, where the biggest challenge is the inherent ambiguity introduced by lifting the 2D pose into 3D space. We propose a novel framework focusing on reducing this ambiguity by predicting the depth of human joints based on 2D human joint locations and body part images. Our approach is built on a two-level hierarchy of Long Short-Term Memory (LSTM) networks which can be trained end-to-end. The first level consists of two components: 1) a skeleton-LSTM which learns the depth information from global human skeleton features; 2) a patch-LSTM which utilizes the local image evidence around joint locations. Both networks have a tree structure defined on the kinematic relations of the human skeleton, so the information at different joints is broadcast through the whole skeleton in a top-down fashion. The two networks are first pre-trained separately on different data sources and then aggregated in the second layer for final depth prediction. The empirical evaluation on the Human3.6M and HHOI datasets demonstrates the advantage of combining global 2D skeleton and local image patches for depth prediction, and our superior quantitative and qualitative performance relative to state-of-the-art methods.
Pages: 3467-3475
Page count: 9
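
The top-down broadcast over the kinematic tree described in the abstract can be illustrated with a small sketch. This is not the paper's network: a scalar tanh update stands in for the LSTM cell, and the five-joint tree, the joint names, and the weights `w_self` and `w_parent` are all hypothetical choices made for illustration only.

```python
import math

# Hypothetical 5-joint kinematic tree, stored as child -> parent.
# The root joint has parent None. Joint names are illustrative only.
PARENT = {
    "pelvis": None,
    "spine": "pelvis",
    "head": "spine",
    "l_hip": "pelvis",
    "l_knee": "l_hip",
}


def top_down_order(parent):
    """Return the joints ordered so that every parent precedes its children."""
    children = {joint: [] for joint in parent}
    root = None
    for joint, par in parent.items():
        if par is None:
            root = joint
        else:
            children[par].append(joint)
    order, stack = [], [root]
    while stack:
        joint = stack.pop()
        order.append(joint)
        # Reverse so children are visited in their original listing order.
        stack.extend(reversed(children[joint]))
    return order


def propagate_top_down(features, parent, w_self=0.5, w_parent=0.5):
    """Toy top-down pass: each joint's state mixes its own feature with
    the state of its parent. A tanh update replaces the LSTM cell here."""
    state = {}
    for joint in top_down_order(parent):
        par = parent[joint]
        parent_state = 0.0 if par is None else state[par]
        state[joint] = math.tanh(w_self * features[joint] + w_parent * parent_state)
    return state


if __name__ == "__main__":
    # Only the root carries a non-zero feature; it still reaches the leaves.
    feats = {joint: 0.0 for joint in PARENT}
    feats["pelvis"] = 1.0
    for joint, value in propagate_top_down(feats, PARENT).items():
        print(f"{joint}: {value:.4f}")
```

Because each joint is visited only after its parent, evidence at the root influences every descendant, which is the sense in which information is broadcast through the skeleton in a top-down fashion.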