CapsulePose: A variational CapsNet for real-time end-to-end 3D human pose estimation

Cited by: 6
Authors
Garau, Nicola [1]
Conci, Nicola [1]
Affiliations
[1] Univ Trento, Via Sommar 9, I-38123 Trento, Italy
Keywords
Capsule networks; 3D human pose estimation; Viewpoint-equivariance; Deep learning; Real-time; RECOGNITION
DOI
10.1016/j.neucom.2022.11.097
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Estimating 3D human poses from images is an ill-posed regression problem, which is usually tackled by viewpoint-invariant convolutional neural networks (CNNs). Recently, capsule networks (CapsNets) have been introduced as a viable alternative to CNNs, ensuring viewpoint-equivariance and drastically reducing both the dataset size and the network complexity, while retaining high output accuracy. We propose a real-time end-to-end human pose estimation (HPE) network which employs state-of-the-art matrix capsules [1] and a fast variational Bayesian capsule routing, without relying on pre-training, complex data augmentation or multiple datasets. We achieve comparable results to the HPE state-of-the-art, and the lowest error among methods using CapsNets, while at the same time achieving other desirable properties, namely greater generalization capabilities, stronger viewpoint equivariance and highly decreased data dependency, allowing for our network to be trained with only a fraction of the available datasets and without any data augmentation.
Pages: 81 - 91
Page count: 11
Related papers
50 in total
  • [21] PSVT: End-to-End Multi-person 3D Pose and Shape Estimation with Progressive Video Transformers
    Qiu, Zhongwei
    Yang, Qiansheng
    Wang, Jian
    Feng, Haocheng
    Han, Junyu
    Ding, Errui
    Xu, Chang
    Fu, Dongmei
    Wang, Jingdong
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023 : 21254 - 21263
  • [22] FLK: A filter with learned kinematics for real-time 3D human pose estimation
    Martini, Enrico
    Boldo, Michele
    Bombieri, Nicola
    SIGNAL PROCESSING, 2024, 224
  • [23] Real-time 3D human pose estimation without skeletal a priori structures
    Bai, Guihu
    Luo, Yanmin
    Pan, Xueliang
    Wang, Jia
    Guo, Jing-Ming
    IMAGE AND VISION COMPUTING, 2023, 132
  • [24] Faster VoxelPose: Real-time 3D Human Pose Estimation by Orthographic Projection
    Ye, Hang
    Zhu, Wentao
    Wang, Chunyu
    Wu, Rujie
    Wang, Yizhou
    COMPUTER VISION - ECCV 2022, PT VI, 2022, 13666 : 142 - 159
  • [25] Achieving Hard Real-Time Capability for 3D Human Pose Estimation Systems
    Schlosser, Patrick
    Ledermann, Christoph
    2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021 : 3772 - 3778
  • [26] Deep learning-based real-time 3D human pose estimation
    Zhang, Xiaoyan
    Zhou, Zhengchun
    Han, Ying
    Meng, Hua
    Yang, Meng
    Rajasegarar, Sutharshan
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 119
  • [27] MobileHumanPose: Toward real-time 3D human pose estimation in mobile devices
    Choi, Sangbum
    Choi, Seokeon
    Kim, Changick
    2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2021, 2021 : 2328 - 2338
  • [28] VNect: Real-time 3D Human Pose Estimation with a Single RGB Camera
    Mehta, Dushyant
    Sridhar, Srinath
    Sotnychenko, Oleksandr
    Rhodin, Helge
    Shafiei, Mohammad
    Seidel, Hans-Peter
    Xu, Weipeng
    Casas, Dan
    Theobalt, Christian
    ACM TRANSACTIONS ON GRAPHICS, 2017, 36 (04)
  • [29] SST: Real-time End-to-end Monocular 3D Reconstruction via Sparse Spatial-Temporal Guidance
    Zhang, Chenyangguang
    Lou, Zhiqiang
    Di, Yan
    Tombari, Federico
    Ji, Xiangyang
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023 : 2033 - 2038
  • [30] LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging
    Ge, Haoyang
    Feng, Qiao
    Jia, Hailong
    Li, Xiongzheng
    Yin, Xiangjun
    Zhou, You
    Yang, Jingyu
    Li, Kun
    2024 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2024, 2024 : 1471 - 1480