Explicit Spatiotemporal Joint Relation Learning for Tracking Human Pose

Cited by: 6
Authors
Sun, Xiao [1]
Li, Chuankang [2, 3]
Lin, Stephen [1]
Affiliations
[1] Microsoft Research, Redmond, WA 98052, USA
[2] Zhejiang University, Hangzhou, People's Republic of China
[3] Microsoft Research Asia, Beijing, People's Republic of China
DOI
10.1109/ICCVW.2019.00344
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present a method for human pose tracking that is based on learning spatiotemporal relationships among joints. Beyond generating the heatmap of a joint in a given frame, our system also learns to predict the offset of the joint from a neighboring joint in the frame. Additionally, it is trained to predict the displacement of the joint from its position in the previous frame, in a manner that can account for possibly changing joint appearance, unlike optical flow. These relational cues in the spatial domain and temporal domain are inferred in a robust manner by attending only to relevant areas in the video frames. By explicitly learning and exploiting these joint relationships, our system achieves state-of-the-art performance on standard benchmarks for various pose tracking tasks including 3D body pose tracking in RGB video, 3D hand pose tracking in depth sequences, and 3D hand gesture tracking in RGB video.
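The abstract names three cues for locating a joint: a per-frame heatmap, an offset from a neighboring joint, and a displacement from the joint's position in the previous frame. The sketch below is only an illustration of how three such cues could be combined into one 2D estimate. The weighted-average fusion, the function names (`heatmap_peak`, `fuse_joint_estimates`), and the `weights` parameter are assumptions made for this example and are not taken from the paper, which does not describe its inference procedure in this record.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def heatmap_peak(heatmap: List[List[float]]) -> Point:
    """Return the (x, y) location of the largest value in a 2D heatmap."""
    best = max(
        ((value, x, y) for y, row in enumerate(heatmap) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return (float(best[1]), float(best[2]))


def fuse_joint_estimates(
    heatmap: List[List[float]],
    neighbor_xy: Point,
    spatial_offset: Point,
    previous_xy: Point,
    temporal_displacement: Point,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> Point:
    """Combine three candidate positions for one joint (hypothetical fusion rule).

    Candidates:
      1. the heatmap peak in the current frame,
      2. the neighboring joint's position plus a predicted spatial offset,
      3. the joint's previous-frame position plus a predicted displacement.
    The weighted average used here is an assumption for illustration only.
    """
    candidates = [
        heatmap_peak(heatmap),
        (neighbor_xy[0] + spatial_offset[0], neighbor_xy[1] + spatial_offset[1]),
        (previous_xy[0] + temporal_displacement[0],
         previous_xy[1] + temporal_displacement[1]),
    ]
    total = sum(weights)
    x = sum(w * c[0] for w, c in zip(weights, candidates)) / total
    y = sum(w * c[1] for w, c in zip(weights, candidates)) / total
    return (x, y)
```

Setting `weights=(1.0, 0.0, 0.0)` reduces the sketch to plain heatmap-peak detection, which makes it easy to compare against the relational cues in isolation.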
Pages: 2825-2835
Page count: 11