Learning a Robust Part-Aware Monocular 3D Human Pose Estimator via Neural Architecture Search

Cited by: 2
Authors
Chen, Zerui [1 ,2 ]
Huang, Yan [1 ]
Yu, Hongyuan [1 ,2 ]
Wang, Liang [1 ,2 ,3 ,4 ]
Affiliations
[1] CASIA, Ctr Res Intelligent Percept & Comp, NLPR, Beijing, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing, Peoples R China
[3] Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing, Peoples R China
[4] Chinese Acad Sci, Artificial Intelligence Res, Beijing, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Monocular 3D human pose estimation; Heterogeneous human body parts; Neural architecture search;
DOI
10.1007/s11263-021-01525-0
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although most existing monocular 3D human pose estimation methods achieve very competitive performance, they are limited by estimating heterogeneous human body parts with the same decoder architecture. In this work, we present an approach for building a part-aware 3D human pose estimator that better handles these heterogeneous body parts. Our proposed method consists of two learning stages: (1) searching for suitable decoder architectures for specific parts and (2) training the part-aware 3D human pose estimator built with these optimized neural architectures. Consequently, our searched model is efficient and compact, and it can automatically select a suitable decoder architecture to estimate each human body part. Compared with previous state-of-the-art models built on the ResNet-50 network, our method achieves better performance while reducing parameters by 64.4% and FLOPs (multiply-adds) by 8.5%. We validate the robustness and stability of our searched models through extensive and rigorous ablation experiments. Our method advances state-of-the-art accuracy on both single-person and multi-person 3D human pose estimation benchmarks at an affordable computational cost.
Pages: 56 - 75
Page count: 20
Related papers
50 in total
  • [21] MONOCULAR 3D HUMAN POSE ESTIMATION BY CLASSIFICATION
    Greif, Thomas
    Lienhart, Rainer
    Sengupta, Debabrata
    2011 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2011,
  • [22] Monocular 3D Pose Estimation via Pose Grammar and Data Augmentation
    Xu, Yuanlu
    Wang, Wenguan
    Liu, Tengyu
    Liu, Xiaobai
    Xie, Jianwen
    Zhu, Song-Chun
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (10) : 6327 - 6344
  • [23] Part-Aware Data Augmentation for 3D Object Detection in Point Cloud
    Choi, Jaeseok
    Song, Yeji
    Kwak, Nojun
    2021 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2021, : 3391 - 3397
  • [24] Improving 3D Human Pose Estimation via 3D Part Affinity Fields
    Liu, Ding
    Zhao, Zixu
    Wang, Xinchao
    Hu, Yuxiao
    Zhang, Lei
    Huang, Thomas S.
    2019 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2019, : 1004 - 1013
  • [25] Modeling vs. Learning Approaches for Monocular 3D Human Pose Estimation
    Gong, Wenjuan
    Brauer, Juergen
    Arens, Michael
    Gonzalez, Jordi
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011,
  • [26] Learning with privileged stereo knowledge for monocular absolute 3D human pose estimation
    Bian, Cunling
    Lu, Weigang
    Feng, Wei
    Wang, Song
    PATTERN RECOGNITION LETTERS, 2025, 189 : 143 - 149
  • [27] LASOR: Learning Accurate 3D Human Pose and Shape via Synthetic Occlusion-Aware Data and Neural Mesh Rendering
    Yang, Kaibing
    Gu, Renshu
    Wang, Maoyu
    Toyoura, Masahiro
    Xu, Gang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 1938 - 1948
  • [28] Generalizing Monocular 3D Human Pose Estimation in the Wild
    Wang, Luyang
    Chen, Yan
    Guo, Zhenhua
    Qian, Keyuan
    Lin, Mude
    Li, Hongsheng
    Ren, Jimmy S.
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 4024 - 4033
  • [29] Recovering 3D human pose from monocular images
    Agarwal, A
    Triggs, B
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2006, 28 (01) : 44 - 58
  • [30] Context-Aware Network for 3D Human Pose Estimation from Monocular RGB Image
    Yin, Binyi
    Zhang, Dongbo
    Li, Shuai
    Hao, Aimin
    Qin, Hong
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,