VTP: volumetric transformer for multi-view multi-person 3D pose estimation

Cited by: 0
Authors
Yuxing Chen
Renshu Gu
Ouhan Huang
Gangyong Jia
Affiliations
[1] Hangzhou Dianzi University, The School of Computer Science and Technology
[2] Fudan University, Key Laboratory for Information Science of Electromagnetic Waves (MoE)
Source
Applied Intelligence | 2023 / Vol. 53
Keywords
3D human pose estimation; Sinkhorn transformer; Multi-person pose estimation; Volumetric representation; Multi-view pose estimation; Sparse Sinkhorn attention
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
This paper presents the Volumetric Transformer Pose Estimator (VTP), the first 3D volumetric transformer framework for multi-view multi-person 3D human pose estimation. VTP aggregates features from 2D keypoints in all camera views and learns the spatial relationships directly in 3D voxel space in an end-to-end fashion. The aggregated 3D features pass through 3D convolutions, are flattened into sequential embeddings, and are fed into a transformer. Through a residual design, the transformer output is then concatenated with the 3D convolutional features, which further improves performance. Sparse Sinkhorn attention is employed to reduce the memory cost, a major bottleneck for volumetric representations, while still achieving excellent performance. The VTP framework combines the high performance of the transformer with volumetric representations and can serve as an alternative to convolutional backbones. Experiments on the Shelf, Campus and CMU Panoptic benchmarks show promising results in terms of both Mean Per Joint Position Error (MPJPE) and Percentage of Correctly estimated Parts (PCP). Our code will be available.
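For readers unfamiliar with the MPJPE metric named in the abstract, the following is a minimal sketch of how it is commonly computed: the mean Euclidean distance between each predicted joint and its ground-truth counterpart. The function name `mpjpe` and the toy coordinates are illustrative assumptions and are not taken from the paper or its code; the result is in whatever unit the input coordinates use.

```python
import math


def mpjpe(pred, gt):
    """Mean per-joint position error for one pose.

    pred, gt: equal-length sequences of (x, y, z) joint coordinates.
    Hypothetical helper for illustration only, not the authors' code.
    """
    if len(pred) != len(gt) or len(pred) == 0:
        raise ValueError("pred and gt must be non-empty and the same length")
    # Sum the Euclidean distance between each predicted and true joint.
    total = 0.0
    for p, g in zip(pred, gt):
        total += math.dist(p, g)
    return total / len(pred)


# Toy example: one joint is exact, the other is off by a 3-4-5 triangle.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(mpjpe(pred, gt))  # 2.5
```

In multi-person benchmarks the error is typically averaged again over all matched persons and frames; that matching step is omitted here.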
Pages: 26568-26579
Page count: 11