TEMPO: Efficient Multi-View Pose Estimation, Tracking, and Forecasting

Cited by: 1
Authors
Choudhury, Rohan [1]
Kitani, Kris M. [1]
Jeni, Laszlo A. [1]
Affiliations
[1] Carnegie Mellon Univ, Robot Inst, Pittsburgh, PA 15213 USA
DOI
10.1109/ICCV51070.2023.01355
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Existing volumetric methods for 3D human pose estimation are accurate, but they are computationally expensive and optimized for single time-step prediction. We present TEMPO, an efficient multi-view pose estimation model that learns a robust spatiotemporal representation, improving pose accuracy while also tracking and forecasting human pose. We significantly reduce computation compared to the state of the art by recurrently computing per-person 2D pose features, fusing both spatial and temporal information into a single representation. In doing so, our model is able to use spatiotemporal context to predict more accurate human poses without sacrificing efficiency. We further use this representation to track human poses over time as well as predict future poses. Finally, we demonstrate that our model is able to generalize across datasets without scene-specific fine-tuning. TEMPO achieves 10% better MPJPE with a 33x improvement in FPS compared to TesseTrack on the challenging CMU Panoptic Studio dataset. Our code and demos are available at https://rccchoudhury.github.io/tempo2023/.
Pages: 14704-14714
Number of pages: 11