Group Sparse-Based Mid-Level Representation for Action Recognition

Cited by: 17
Authors
Zhang, Shiwei [1]
Gao, Changxin [1]
Chen, Feifei [1]
Luo, Sihui [2]
Sang, Nong [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Sch Automat, Natl Key Lab Sci & Technol Multispectral Informat, Wuhan 430074, Peoples R China
[2] MediaTek Inc, Wuhan 430000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Group sparse (GS); human action recognition; mid-level part; saliency driven max-pooling (SMP); video representation; HUMAN ACTION CATEGORIES; SELECTION; PARTS; FACE
DOI
10.1109/TSMC.2016.2625840
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Mid-level parts have been shown to be effective for human action recognition in videos. Typically, these semantic parts are first mined with heuristic rules, and videos are then represented via the volumetric max-pooling (VMP) method. However, these methods have two issues. 1) The VMP strategy divides videos by static grids, so a semantic part may occur at different locations in different videos; that is, the VMP strategy loses space-time invariance. To solve this problem, we propose a saliency-driven max-pooling (SMP) scheme to represent a video. We extract the semantic cues of the video from the saliency map and dynamically pool the local maximum responses. This scheme can be considered a semantic content-based feature alignment method. 2) The parts discovered by heuristic rules may be intuitive but not discriminative enough for action classification, because the rules neglect the relations between the detectors. To address this issue, we propose a sparse classifier model to select discriminative parts. Moreover, to further improve the discriminative ability of the representation, we conduct feature selection according to the magnitude of the corresponding entries of the model coefficients. We conduct experiments on four challenging datasets: KTH, Olympic Sports, UCF50, and HMDB51. The results show that the proposed method significantly outperforms the state-of-the-art methods.
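The contrast the abstract draws between static-grid pooling and saliency-driven pooling can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the saliency map is computed or how responses are pooled, so the sketch assumes a single part detector, a saliency map of the same shape as the response volume, and a simple threshold mask. The function names `grid_max_pool` and `saliency_max_pool`, the `grid` and `threshold` parameters, and the `(T, H, W)` layout are all hypothetical.

```python
import numpy as np

def grid_max_pool(responses, grid=(1, 2, 2)):
    """Static-grid volumetric max-pooling: one max per fixed (t, h, w) cell.

    responses: detector response volume of shape (T, H, W) (assumed layout).
    """
    pooled = []
    for t_block in np.array_split(responses, grid[0], axis=0):
        for h_block in np.array_split(t_block, grid[1], axis=1):
            for w_block in np.array_split(h_block, grid[2], axis=2):
                pooled.append(w_block.max())
    return np.array(pooled)

def saliency_max_pool(responses, saliency, threshold=0.5):
    """Max response over salient voxels only (assumed thresholding rule).

    The pooled region follows the salient content instead of a fixed cell,
    so the result does not depend on where that content sits in the frame.
    """
    mask = saliency >= threshold
    if not mask.any():
        # No salient voxels: fall back to a global max.
        return responses.max()
    return responses[mask].max()

def make_video(top, left, size=4, shape=(4, 8, 8)):
    """Toy data: a strong response blob whose location is also the salient region."""
    responses = np.full(shape, 0.1)
    saliency = np.zeros(shape)
    responses[:, top:top + size, left:left + size] = 0.9
    saliency[:, top:top + size, left:left + size] = 1.0
    return responses, saliency

resp_a, sal_a = make_video(top=0, left=0)  # part appears top-left
resp_b, sal_b = make_video(top=4, left=4)  # same part appears bottom-right

print(grid_max_pool(resp_a), grid_max_pool(resp_b))
print(saliency_max_pool(resp_a, sal_a), saliency_max_pool(resp_b, sal_b))
```

In this toy setup the grid-pooled vectors differ between the two videos because the strong response falls into different cells, whereas the saliency-masked value is the same for both, which is the alignment effect the abstract attributes to SMP.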
Pages: 660-672
Page count: 13
Related papers
50 in total
  • [1] Learning a Mid-Level Representation for Multiview Action Recognition
    Liu, Cuiwei
    Li, Zhaokui
    Shi, Xiangbin
    Du, Chong
    ADVANCES IN MULTIMEDIA, 2018, 2018
  • [2] Unsupervised Deep Learning of Mid-Level Video Representation for Action Recognition
    Hou, Jingyi
    Wu, Xinxiao
    Chen, Jin
    Luo, Jiebo
    Jia, Yunde
    THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 6910 - 6917
  • [3] Action Recognition by Hierarchical Mid-level Action Elements
    Lan, Tian
    Zhu, Yuke
    Zamir, Amir Roshan
    Savarese, Silvio
    2015 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2015, : 4552 - 4560
  • [4] Action Recognition with Discriminative Mid-Level Features
    Liu, Cuiwei
    Kong, Yu
    Wu, Xinxiao
    Jia, Yunde
    2012 21ST INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR 2012), 2012, : 3366 - 3369
  • [5] Learning part-based mid-level representation for visual recognition
    Yuan, Baodi
    Tu, Jian
    Zhao, Rui-Wei
    Zheng, Yingbin
    Jiang, Yu-Gang
    NEUROCOMPUTING, 2018, 275 : 2126 - 2136
  • [6] Learning a discriminative mid-level feature for action recognition
    Liu, Cuiwei
    Pei, Mingtao
    Wu, Xinxiao
    Kong, Yu
    Jia, Yunde
    Science China(Information Sciences), 2014, 57 (05) : 195 - 207
  • [7] Action recognition by learning mid-level motion features
    Fathi, Alireza
    Mori, Greg
    2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3064 - 3071
  • [8] Learning a discriminative mid-level feature for action recognition
    Liu, Cuiwei
    Pei, Mingtao
    Wu, Xinxiao
    Kong, Yu
    Jia, Yunde
    SCIENCE CHINA-INFORMATION SCIENCES, 2014, 57 (05) : 1 - 13
  • [9] Learning a discriminative mid-level feature for action recognition
    Liu, Cuiwei
    Pei, Mingtao
    Wu, Xinxiao
    Kong, Yu
    Jia, Yunde
    Science China Information Sciences, 2014, 57 : 1 - 13
  • [10] Shape Recognition by Combining Contour and Skeleton into a Mid-Level Representation
    Shen, Wei
    Wang, Xinggang
    Yao, Cong
    Bai, Xiang
    PATTERN RECOGNITION (CCPR 2014), PT I, 2014, 483 : 391 - 400