A Spatio-Temporal CRF for Human Interaction Understanding

Cited by: 30
Authors
Wang, Zhenhua [1]
Liu, Sheng [2]
Zhang, Jianhua [3]
Chen, Shengyong [2,4]
Guan, Qiu [2]
Affiliations
[1] Zhejiang Univ Technol, Sch Comp Sci, Hangzhou 310014, Zhejiang, Peoples R China
[2] Zhejiang Univ Technol, Dept Comp Sci, Hangzhou 310014, Zhejiang, Peoples R China
[3] Zhejiang Univ Technol, Coll Comp Sci, Hangzhou 310014, Zhejiang, Peoples R China
[4] Tianjin Univ Technol, Tianjin 300384, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Conditional random fields (CRFs); human action recognition (HAR); interaction; video understanding; action recognition
DOI
10.1109/TCSVT.2016.2539699
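Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract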
A better understanding of human interactions in videos can be achieved by simultaneously considering the coarse interactions between people, the action of each individual, and the activity of all people as a whole. We divide the recognition task into two stages. The first stage distinguishes interactions from noninteractions and recognizes actions and activities based on local image information; in the second stage, actions and activities are recognized globally, based on the local recognition results. A conditional random field (CRF) is designed to model human interactions in the spatio-temporal space. Unlike most existing global models, which cover either action variables or activity variables only, our model covers both by considering the interactions between the different types of variables. The graph structure of the CRF is predicted by a model learned from training data, in contrast to traditional graph construction methods, which typically rely on human heuristics. We learn the parameters of the CRF via a structured support vector machine. We propose an efficient inference algorithm to estimate labels in long videos containing many people. Our model provides both a semantic-level understanding of human interactions in videos and competitive action and activity recognition performance.
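The two-stage idea in the abstract (local scores first, then global labeling over a graph of interacting people) can be illustrated with a toy pairwise CRF. The sketch below is not the paper's model or its inference algorithm: the function name `map_labels`, the label set, and all numeric scores are hypothetical, the graph is given by hand rather than predicted, and inference is plain exhaustive search, which is only feasible for a handful of people.

```python
from itertools import product


def map_labels(unary, edges, pairwise):
    """Exhaustive MAP inference for a tiny pairwise CRF (illustrative only).

    unary[i][a]    : local score of label a for person i (stand-in for stage 1).
    edges          : list of (i, j) pairs assumed to interact.
    pairwise[a][b] : compatibility score of labels a and b on an edge.
    """
    n = len(unary)
    num_labels = len(unary[0])
    best_score, best_labels = float("-inf"), None
    # Enumerate every joint labeling and keep the highest-scoring one.
    for labels in product(range(num_labels), repeat=n):
        score = sum(unary[i][labels[i]] for i in range(n))
        score += sum(pairwise[labels[i]][labels[j]] for i, j in edges)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels, best_score


# Hypothetical toy data: 3 people, labels 0 = "stand", 1 = "handshake".
unary = [[0.2, 1.0], [0.6, 0.5], [1.0, 0.1]]
edges = [(0, 1)]                       # persons 0 and 1 are assumed to interact
pairwise = [[0.0, -1.0], [-1.0, 1.0]]  # agreeing on "handshake" is rewarded
labels, score = map_labels(unary, edges, pairwise)
```

In this toy case the local scores alone would label person 1 as "stand", but the pairwise term on the edge (0, 1) flips that decision to "handshake" in the joint labeling, which is the kind of correction a global stage is meant to make.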
Pages: 1647-1660
Number of pages: 14