Region-sequence based six-stream CNN features for general and fine-grained human action recognition in videos

Cited by: 64
Authors
Ma, Miao [1 ,2 ,5 ]
Marturi, Naresh [3 ,6 ]
Li, Yibin [5 ]
Leonardis, Ales [2 ]
Stolkin, Rustam [4 ]
Affiliations
[1] Qingdao Univ, Coll Automat & Elect Engn, Qingdao 266071, Shandong, Peoples R China
[2] Univ Birmingham, Sch Comp Sci, Birmingham B15 2TT, W Midlands, England
[3] Univ Birmingham, Birmingham B15 2TT, W Midlands, England
[4] Univ Birmingham, Robot, Birmingham B15 2TT, W Midlands, England
[5] Shandong Univ, Sch Control Sci & Engn, Jinan 250061, Shandong, Peoples R China
[6] KUKA Robot UK Ltd, Wednesbury Great Western St, Birmingham WS10 7LL, W Midlands, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK;
Keywords
Human pose; Action recognition; Video understanding; REPRESENTATION; HISTOGRAMS; SYSTEM;
DOI
10.1016/j.patcog.2017.11.026
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper addresses both general and fine-grained human action recognition in video sequences. Compared with general human actions, fine-grained action information is more difficult to detect and occupies relatively small image regions. Our work seeks to improve fine-grained action discrimination while retaining the ability to perform general action recognition. Our method first estimates human pose and body-part positions in video sequences by extending our recent work on human pose tracking, and crops patches at different scales to obtain richer action information from appearance and motion cues at a variety of scales. We then use a Convolutional Neural Network (CNN) to process each such image patch. Instead of using the one-dimensional output feature of the fully connected layer, we use the outputs of the CNN's pooling layer, which contain more spatial information. The high-dimensional pooling features are then reduced by encoding to generate the final human action descriptors for classification. Our method reduces feature dimension while effectively combining appearance and motion information in a unified framework. We have carried out empirical experiments on two publicly available human action datasets, comparing the recognition results of our algorithm against six recent state-of-the-art methods from the literature. The results suggest comparatively strong performance of our method. (C) 2017 Elsevier Ltd. All rights reserved.
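The abstract's step of keeping the pooling-layer output and then reducing its dimension by encoding can be illustrated with a small sketch. The abstract does not say which encoding the authors use, so the VLAD-style residual encoding below is a swapped-in assumption, and the toy feature map, the codebook, and all function names (`pooled_map_to_descriptors`, `nearest_center`, `vlad_encode`) are hypothetical. It only shows the general idea: a C x H x W pooling output is treated as H*W local descriptors, which are aggregated into a vector of fixed length K*C.

```python
import math
from typing import List

Vector = List[float]


def pooled_map_to_descriptors(fmap: List[List[List[float]]]) -> List[Vector]:
    """Turn a C x H x W pooling output into H*W local descriptors of length C."""
    channels, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    return [
        [fmap[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]


def nearest_center(desc: Vector, codebook: List[Vector]) -> int:
    """Index of the codebook center closest to desc (squared Euclidean distance)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(range(len(codebook)), key=lambda k: sq_dist(desc, codebook[k]))


def vlad_encode(descriptors: List[Vector], codebook: List[Vector]) -> Vector:
    """VLAD-style encoding (an assumed stand-in for the paper's encoding):
    sum residuals to the nearest center, flatten, then L2-normalize."""
    dim = len(codebook[0])
    residuals = [[0.0] * dim for _ in codebook]
    for desc in descriptors:
        k = nearest_center(desc, codebook)
        for i in range(dim):
            residuals[k][i] += desc[i] - codebook[k][i]
    flat = [v for row in residuals for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


# Toy pooling output with C=2 channels on a 2x2 spatial grid (8 values).
fmap = [
    [[0.0, 1.0], [0.0, 1.0]],  # channel 0
    [[0.0, 1.0], [0.0, 1.0]],  # channel 1
]
# Hypothetical codebook with K=2 centers; in practice it would be learned.
codebook = [[0.0, 0.0], [1.0, 0.0]]

descriptors = pooled_map_to_descriptors(fmap)
encoded = vlad_encode(descriptors, codebook)
print(len(descriptors), len(encoded))
```

In this toy case the 8 pooled values become a 4-dimensional descriptor, and the output length depends only on the codebook size and channel count, not on the spatial size of the patch.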
Pages: 506-521
Page count: 16
Related papers
50 in total
  • [1] Hand Detection and Tracking in Videos for Fine-Grained Action Recognition
    Do, Nga H.
    Yanai, Keiji
    [J]. COMPUTER VISION - ACCV 2014 WORKSHOPS, PT I, 2015, 9008 : 19 - 34
  • [2] Pipelining Localized Semantic Features for Fine-Grained Action Recognition
    Zhou, Yang
    Ni, Bingbing
    Yan, Shuicheng
    Moulin, Pierre
    Tian, Qi
    [J]. COMPUTER VISION - ECCV 2014, PT IV, 2014, 8692 : 481 - 496
  • [3] Fine-Grained Activity Recognition Based on Features of Action Subsegments and Incremental Broad Learning
    Chen, Shi
    Wu, Sheng
    Zhu, Licai
    Yang, Hao
    [J]. ALGORITHMS AND ARCHITECTURES FOR PARALLEL PROCESSING, ICA3PP 2021, PT I, 2022, 13155 : 100 - 114
  • [4] Fine-grained Human Action Recognition Based on Zero-Shot Learning
    Zhao, Yahui
    Shi, Ping
    You, Jian
    [J]. PROCEEDINGS OF 2019 IEEE 10TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2019), 2019, : 294 - 297
  • [5] Fine-Grained Activity Recognition with Holistic and Pose Based Features
    Pishchulin, Leonid
    Andriluka, Mykhaylo
    Schiele, Bernt
    [J]. PATTERN RECOGNITION, GCPR 2014, 2014, 8753 : 678 - 689
  • [6] A General Vocabulary Based Approach for Fine-Grained Object Recognition
    Aich, Shubhra
    Lee, Chil-Woo
    [J]. IMAGE AND VIDEO TECHNOLOGY, PSIVT 2015, 2016, 9431 : 572 - 581
  • [7] CNN-Based Sequence Labeling for Fine-Grained Opinion Mining of Microblogs
    Cheng, Jiajun
    Li, Pei
    Zhang, Xin
    Ding, Zhaoyun
    Wang, Hui
    [J]. TRENDS AND APPLICATIONS IN KNOWLEDGE DISCOVERY AND DATA MINING, 2017, 2017, 10526 : 94 - 103
  • [8] Hierarchical Joint CNN-Based Models for Fine-Grained Cars Recognition
    Liu, Maolin
    Yu, Chengyue
    Ling, Hefei
    Lei, Jie
    [J]. CLOUD COMPUTING AND SECURITY, ICCCS 2016, PT II, 2016, 10040 : 337 - 347
  • [9] JOINT LEARNING ON THE HIERARCHY REPRESENTATION FOR FINE-GRAINED HUMAN ACTION RECOGNITION
    Leong, Mei Chee
    Tan, Hui Li
    Zhang, Haosong
    Li, Liyuan
    Lin, Feng
    Lim, Joo Hwee
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 1059 - 1063
  • [10] Human Action Recognition Using Deep Data: A Fine-Grained Study
    Rao, D. Surendra
    Potturu, Sudharsana Rao
    Bhagyaraju, V
    [J]. INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2022, 22 (06): : 97 - 108