Group Sparse-Based Mid-Level Representation for Action Recognition

Cited by: 17

Authors:

Zhang, Shiwei [1]

Gao, Changxin [1]

Chen, Feifei [1]

Luo, Sihui [2]

Sang, Nong [1]

Affiliations:

[1] Huazhong Univ Sci & Technol, Sch Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China

[2] MediaTek Inc, Wuhan 430000, Peoples R China

Source:

IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS | 2017, Vol. 47, No. 4

Funding:

National Natural Science Foundation of China

Keywords:

Group sparse (GS); human action recognition; mid-level part; saliency driven max-pooling (SMP); video representation; HUMAN ACTION CATEGORIES; SELECTION; PARTS; FACE;

DOI:

10.1109/TSMC.2016.2625840

Chinese Library Classification:

TP [Automation technology, computer technology]

Subject classification code:

0812

Abstract:

Mid-level parts have been shown to be effective for human action recognition in videos. Typically, these semantic parts are first mined with heuristic rules, and videos are then represented via the volumetric max-pooling (VMP) method. However, these methods have two issues. 1) The VMP strategy divides videos by static grids, whereas a semantic part may occur at different locations in different videos, so the VMP strategy loses space-time invariance. To solve this problem, we propose a saliency-driven max-pooling scheme to represent a video: we extract the video's semantic cues from the saliency map and dynamically pool the local maximum responses. This scheme can be considered a semantic content-based feature alignment method. 2) The parts discovered by heuristic rules may be intuitive but not discriminative enough for action classification, because they neglect the relations between the detectors. To address this issue, we propose to apply a sparse classifier model to select discriminative parts. Moreover, to further improve the discriminative ability of the representation, we propose to conduct feature selection by the magnitude of the corresponding entries of the model coefficients. We conduct experiments on four challenging datasets: KTH, Olympic Sports, UCF50, and HMDB51. The results show that the proposed method significantly outperforms the state-of-the-art methods.
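
The first issue in the abstract, that static-grid pooling loses space-time invariance, can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the 1-D per-frame response lists, the saliency threshold, and the two-bin (salient vs. background) pooling are all assumptions made for illustration only.

```python
def grid_max_pool(responses, cells=2):
    """Static-grid pooling: max detector response inside each fixed time cell.

    Assumes len(responses) >= cells. The last cell absorbs any remainder.
    """
    size = len(responses) // cells
    pooled = []
    for i in range(cells):
        start = i * size
        end = (i + 1) * size if i < cells - 1 else len(responses)
        pooled.append(max(responses[start:end]))
    return pooled


def saliency_max_pool(responses, saliency, threshold=0.5):
    """Hypothetical saliency-driven pooling: one max over salient frames,
    one max over the remaining (background) frames."""
    salient = [r for r, s in zip(responses, saliency) if s >= threshold]
    background = [r for r, s in zip(responses, saliency) if s < threshold]
    return [max(salient, default=0.0), max(background, default=0.0)]


# Toy data: the same part fires early in video A and late in video B.
video_a, saliency_a = [0.9, 0.1, 0.0, 0.0], [1.0, 0.2, 0.1, 0.1]
video_b, saliency_b = [0.0, 0.0, 0.1, 0.9], [0.1, 0.1, 0.2, 1.0]

grid_a, grid_b = grid_max_pool(video_a), grid_max_pool(video_b)
sal_a = saliency_max_pool(video_a, saliency_a)
sal_b = saliency_max_pool(video_b, saliency_b)
```

In this toy setting the grid-pooled vectors of the two videos differ because the part falls into different cells, while the saliency-pooled vectors coincide because pooling follows the content rather than a fixed position.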


Pages: 660-672

Page count: 13

50 items in total

[1] Learning a Mid-Level Representation for Multiview Action Recognition
Liu, Cuiwei
Li, Zhaokui
Shi, Xiangbin
Du, Chong
ADVANCES IN MULTIMEDIA, 2018, 2018
[2] Unsupervised Deep Learning of Mid-Level Video Representation for Action Recognition
Hou, Jingyi
Wu, Xinxiao
Chen, Jin
Luo, Jiebo
Jia, Yunde
THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 6910 - 6917
[3] Action Recognition by Hierarchical Mid-level Action Elements
Lan, Tian
Zhu, Yuke
Zamir, Amir Roshan
Savarese, Silvio
2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4552 - 4560
[4] Action Recognition with Discriminative Mid-Level Features
Liu, Cuiwei
Kong, Yu
Wu, Xinxiao
Jia, Yunde
2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3366 - 3369
[5] Learning part-based mid-level representation for visual recognition
Yuan, Baodi
Tu, Jian
Zhao, Rui-Wei
Zheng, Yingbin
Jiang, Yu-Gang
NEUROCOMPUTING, 2018, 275 : 2126 - 2136
[6] Learning a discriminative mid-level feature for action recognition
LIU CuiWei
PEI MingTao
WU XinXiao
KONG Yu
JIA YunDe
Science China(Information Sciences), 2014, 57 (05) : 195 - 207
[7] Action recognition by learning mid-level motion features
Fathi, Alireza
Mori, Greg
2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3064 - 3071
[8] Learning a discriminative mid-level feature for action recognition
Liu CuiWei
Pei MingTao
Wu XinXiao
Kong Yu
Jia YunDe
SCIENCE CHINA-INFORMATION SCIENCES, 2014, 57 (05) : 1 - 13
[9] Learning a discriminative mid-level feature for action recognition
CuiWei Liu
MingTao Pei
XinXiao Wu
Yu Kong
YunDe Jia
Science China Information Sciences, 2014, 57 : 1 - 13
[10] Shape Recognition by Combining Contour and Skeleton into a Mid-Level Representation
Shen, Wei
Wang, Xinggang
Yao, Cong
Bai, Xiang
PATTERN RECOGNITION (CCPR 2014), PT I, 2014, 483 : 391 - 400
