Hierarchical Spatio-Temporal Context Modeling for Action Recognition

Cited by: 0

Authors:

Sun, Ju [1]

Wu, Xiao [2]

Yan, Shuicheng [3]

Cheong, Loong-Fah [3]

Chua, Tat-Seng [4]

Li, Jintao [2]

Affiliations:

[1] Natl Univ Singapore, Interact & Digital Media Inst, Singapore 117548, Singapore

[2] Chinese Acad Sci, Inst Comp Technol, Beijing 100864, Peoples R China

[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117548, Singapore

[4] Natl Univ Singapore, Sch Comp, Singapore 117548, Singapore

Source:

CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4 | 2009

Funding:

National Research Foundation, Singapore

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The problem of recognizing actions in realistic videos is challenging yet absorbing owing to its great potential in many practical applications. Most previous research is limited by its use of simplified action databases recorded under controlled environments, or by its focus on excessively localized features that do not sufficiently encapsulate the spatio-temporal context. In this paper we propose to model the spatio-temporal context information in a hierarchical way, where three levels of context are exploited in ascending order of abstraction: 1) point-level context (SIFT average descriptor), 2) intra-trajectory context (trajectory transition descriptor), and 3) inter-trajectory context (trajectory proximity descriptor). To obtain efficient and compact representations for the latter two levels, we encode the spatio-temporal context information into the transition matrix of a Markov process, and then extract its stationary distribution as the final context descriptor. Building on multi-channel nonlinear SVMs, we validate the proposed hierarchical framework on the realistic action (HOHA) and event (LSCOM) recognition databases, and achieve 27% and 66% relative performance improvements over the state-of-the-art results, respectively. We further propose to employ the Multiple Kernel Learning (MKL) technique to prune the kernels and so speed up algorithm evaluation.
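
The abstract's step of turning a Markov transition matrix into a stationary-distribution descriptor can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `stationary_distribution`, the raw transition-count input, the additive smoothing term, and the power-iteration solver are all assumptions made for illustration, and the record does not say how the paper quantizes trajectory states.

```python
def stationary_distribution(counts, iters=1000, tol=1e-12, eps=1e-6):
    """Approximate the stationary distribution of a Markov chain.

    counts[i][j] is a hypothetical count of observed transitions from
    state i to state j (for example, between quantized motion states).
    """
    n = len(counts)

    # Row-normalize the counts into transition probabilities. The small
    # smoothing term eps is an assumption: it keeps every entry positive,
    # so the chain is irreducible and aperiodic and the stationary
    # distribution is unique.
    P = []
    for row in counts:
        total = sum(row) + eps * n
        P.append([(c + eps) / total for c in row])

    # Power iteration: repeatedly apply pi <- pi * P from a uniform start.
    pi = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        converged = max(abs(a - b) for a, b in zip(nxt, pi)) < tol
        pi = nxt
        if converged:
            break
    return pi


# Toy two-state example with made-up counts.
descriptor = stationary_distribution([[9, 1], [5, 5]])
```

For this toy input the transition matrix is close to [[0.9, 0.1], [0.5, 0.5]], whose stationary distribution is approximately (5/6, 1/6). The resulting vector has a fixed length equal to the number of states regardless of how long the observed sequence was, which is one way to read the abstract's claim of a compact representation.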

Citation

Pages: 2004 / +

Page count: 2

