Hierarchical Spatio-Temporal Context Modeling for Action Recognition

Cited by: 0
Authors
Sun, Ju [1]
Wu, Xiao [2]
Yan, Shuicheng [3]
Cheong, Loong-Fah [3]
Chua, Tat-Seng [4]
Li, Jintao [2]
Affiliations
[1] Natl Univ Singapore, Interact & Digital Media Inst, Singapore 117548, Singapore
[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100864, Peoples R China
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117548, Singapore
[4] Natl Univ Singapore, Sch Comp, Singapore 117548, Singapore
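Funding
National Research Foundation, Singapore
Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract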
The problem of recognizing actions in realistic videos is challenging yet absorbing owing to its great potential in many practical applications. Most previous research is limited by its use of simplified action databases recorded under controlled environments, or by its focus on excessively localized features that do not sufficiently encapsulate the spatio-temporal context. In this paper we propose to model the spatio-temporal context information in a hierarchical way, where three levels of context are exploited in ascending order of abstraction: 1) point-level context (SIFT average descriptor), 2) intra-trajectory context (trajectory transition descriptor), and 3) inter-trajectory context (trajectory proximity descriptor). To obtain efficient and compact representations for the latter two levels, we encode the spatio-temporal context information into the transition matrix of a Markov process, and then extract its stationary distribution as the final context descriptor. Building on multi-channel nonlinear SVMs, we validate the proposed hierarchical framework on the realistic action (HOHA) and event (LSCOM) recognition databases, and achieve 27% and 66% relative performance improvements over the state-of-the-art results, respectively. We further propose to employ the Multiple Kernel Learning (MKL) technique to prune the kernels and thereby speed up algorithm evaluation.
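The abstract's step of turning a Markov transition matrix into a stationary-distribution descriptor can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `row_normalize` and `stationary_distribution`, the additive smoothing, the toy count matrix, and the use of plain power iteration are all assumptions made for illustration. The abstract does not say how the transition matrix is built or which solver is used.

```python
from typing import List

Matrix = List[List[float]]


def row_normalize(counts: Matrix, smoothing: float = 1e-6) -> Matrix:
    """Turn a square matrix of transition counts into a row-stochastic matrix.

    The additive smoothing is an assumption of this sketch: it keeps every
    transition possible, so the chain has a unique stationary distribution.
    """
    n = len(counts)
    matrix = []
    for row in counts:
        smoothed = [c + smoothing for c in row]
        total = sum(smoothed)
        if total == 0.0:
            # An all-zero row with no smoothing: fall back to a uniform row.
            matrix.append([1.0 / n] * n)
        else:
            matrix.append([c / total for c in smoothed])
    return matrix


def stationary_distribution(P: Matrix, tol: float = 1e-12,
                            max_iter: int = 10_000) -> List[float]:
    """Approximate pi with pi = pi * P by power iteration from a uniform start."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        # One step of the chain: nxt[j] = sum_i pi[i] * P[i][j].
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        total = sum(nxt)
        nxt = [x / total for x in nxt]  # guard against rounding drift
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi  # best estimate if the tolerance was not reached


if __name__ == "__main__":
    # Hypothetical counts of transitions between three quantized states.
    toy_counts = [[8.0, 1.0, 1.0],
                  [2.0, 6.0, 2.0],
                  [1.0, 3.0, 6.0]]
    descriptor = stationary_distribution(row_normalize(toy_counts))
    print(descriptor)
```

Whatever the number of observed transitions, the resulting descriptor has one entry per state, which is one way to read the abstract's claim that the representation is compact.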
Pages: 2004 / +
Number of pages: 2