Enhancing egocentric 3D pose estimation with third person views

Cited by: 2
Authors
Dhamanaskar, Ameya [1 ]
Dimiccoli, Mariella [1 ]
Corona, Enric [1 ]
Pumarola, Albert [1 ]
Moreno-Noguer, Francesc [1 ]
Affiliations
[1] UPC, CSIC, Inst Robot & Informat Ind, Carrer Llorens & Artigas 4-6, Barcelona 08028, Spain
Keywords
3D pose estimation; Self-supervised learning; Egocentric vision
DOI
10.1016/j.patcog.2023.109358
CLC classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose a novel approach to enhance the 3D body pose estimation of a person computed from videos captured from a single wearable camera. The main technical contribution consists of leveraging high-level features linking first- and third-person views in a joint embedding space. To learn such an embedding space we introduce First2Third-Pose, a new paired synchronized dataset of nearly 2000 videos depicting human activities captured from both first- and third-view perspectives. We explicitly consider spatial- and motion-domain features, combined using a semi-Siamese architecture trained in a self-supervised fashion. Experimental results demonstrate that the joint multi-view embedding space learned with our dataset is useful to extract discriminatory features from arbitrary single-view egocentric videos, with no need to perform any sort of domain adaptation or knowledge of camera parameters. An extensive evaluation demonstrates that we achieve significant improvement in egocentric 3D body pose estimation performance on two unconstrained datasets, over three supervised state-of-the-art approaches. The collected dataset and pre-trained model are available for research purposes. (c) 2023 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
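The abstract describes a semi-Siamese architecture trained self-supervised so that synchronized first- and third-person clips map close together in a joint embedding space. As a rough illustration only (not the paper's actual implementation), the sketch below pairs a shared projection with view-specific heads and an InfoNCE-style contrastive loss; all dimensions, weight shapes, and the specific loss are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM, EMBED_DIM = 128, 32  # hypothetical feature / embedding sizes

# Semi-Siamese sketch: both branches share W_shared, but each viewpoint
# keeps its own small head (first-person "ego" vs third-person "exo").
W_shared = rng.normal(scale=0.1, size=(FEAT_DIM, EMBED_DIM))
W_ego = rng.normal(scale=0.1, size=(EMBED_DIM, EMBED_DIM))  # first-person head
W_exo = rng.normal(scale=0.1, size=(EMBED_DIM, EMBED_DIM))  # third-person head

def embed(x, w_head):
    """Project per-clip features into the joint space and L2-normalise."""
    z = np.maximum(x @ W_shared, 0.0) @ w_head  # ReLU, then view-specific head
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

def info_nce(z_ego, z_exo, tau=0.1):
    """InfoNCE-style loss: synchronized (diagonal) first/third-person pairs
    are positives; every other pairing in the batch serves as a negative."""
    logits = (z_ego @ z_exo.T) / tau
    logits -= logits.max(axis=1, keepdims=True)            # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))

# Toy batch of 8 synchronized clip pairs (random stand-ins for real features).
x_ego = rng.normal(size=(8, FEAT_DIM))
x_exo = rng.normal(size=(8, FEAT_DIM))
loss = info_nce(embed(x_ego, W_ego), embed(x_exo, W_exo))
```

Minimising such a loss over synchronized pairs requires no pose labels or camera parameters, which is what lets the learned embedding transfer to arbitrary single-view egocentric videos.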
Pages: 11
Related papers
50 records in total
  • [41] Hierarchical Temporal Transformer for 3D Hand Pose Estimation and Action Recognition from Egocentric RGB Videos
    Wen, Yilin
    Pan, Hao
    Yang, Lei
    Pan, Jia
    Komura, Taku
    Wang, Wenping
    2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 21243 - 21253
  • [42] Adapted human pose: monocular 3D human pose estimation with zero real 3D pose data
    Liu, Shuangjun
    Sehgal, Naveen
    Ostadabbas, Sarah
    APPLIED INTELLIGENCE, 2022, 52 (12) : 14491 - 14506
  • [44] 3D Human Pose Estimation = 2D Pose Estimation + Matching
    Chen, Ching-Hang
    Ramanan, Deva
    30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 5759 - 5767
  • [45] RF-based Multi-view Pose Machine for Multi-Person 3D Pose Estimation
    Xie, Chunyang
    Zhang, Dongheng
    Wu, Zhi
    Yu, Cong
    Hu, Yang
    Sun, Qibin
    Chen, Yan
    2023 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME, 2023, : 2669 - 2674
  • [46] CRENet: Crowd region enhancement network for multi-person 3D pose estimation
    Li, Zhaokun
    Liu, Qiong
    IMAGE AND VISION COMPUTING, 2024, 151
  • [47] Multi-Person 3D Pose and Shape Estimation via Inverse Kinematics and Refinement
    Cha, Junuk
    Saqlain, Muhammad
    Kim, GeonU
    Shin, Mingyu
    Baek, Seungryul
    COMPUTER VISION - ECCV 2022, PT V, 2022, 13665 : 660 - 677
  • [48] Unsupervised universal hierarchical multi-person 3D pose estimation for natural scenes
    Gu, Renshu
    Jiang, Zhongyu
    Wang, Gaoang
    McQuade, Kevin
    Hwang, Jenq-Neng
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (23) : 32883 - 32906
  • [49] Exploring Severe Occlusion: Multi-Person 3D Pose Estimation with Gated Convolution
    Gu, Renshu
    Wang, Gaoang
    Hwang, Jenq-Neng
    2020 25TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2021, : 8243 - 8250
  • [50] Multi-person Absolute 3D Human Pose Estimation with Weak Depth Supervision
    Veges, Marton
    Lorincz, Andras
    ARTIFICIAL NEURAL NETWORKS AND MACHINE LEARNING, ICANN 2020, PT I, 2020, 12396 : 258 - 270