SkeletonNet: Mining Deep Part Features for 3-D Action Recognition

Cited by: 135
Authors
Ke, Qiuhong [1]
An, Senjian [1]
Bennamoun, Mohammed [1]
Sohel, Ferdous [2]
Boussaid, Farid [3]
Affiliations
[1] Univ Western Australia, Sch Comp Sci & Software Engn, Crawley, WA 6009, Australia
[2] Murdoch Univ, Sch Engn & Informat Technol, Murdoch, WA 6150, Australia
[3] Univ Western Australia, Sch Elect Elect & Comp Engn, Crawley, WA 6009, Australia
Funding
Australian Research Council
Keywords
Convolutional neural networks (CNNs); robust features; 3-D action recognition; REAL-TIME; TRACKING; RGB;
DOI
10.1109/LSP.2017.2690339
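Chinese Library Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology]
Discipline classification codes
0808; 0809
Abstract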
This letter presents SkeletonNet, a deep learning framework for skeleton-based 3-D action recognition. Given a skeleton sequence, two cues are important for action recognition: the spatial structure of the skeleton joints in each frame and the temporal information across frames. We first extract body-part-based features from each frame of the skeleton sequence. Unlike the original coordinates of the skeleton joints, the proposed features are invariant to translation, rotation, and scale. To learn robust temporal information, instead of treating the features of all frames as a time series, we transform the features into images and feed them to the proposed deep learning network. The network contains two parts: one extracts general features from the input images, and the other generates a discriminative and compact representation for action recognition. The proposed method is tested on the SBU Kinect Interaction dataset, the CMU dataset, and the large-scale NTU RGB+D dataset, and achieves state-of-the-art performance.
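The abstract does not spell out the feature definition, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes, hypothetically, that each frame is described by the pairwise cosines between vectors drawn from a reference joint to the other joints. Subtracting the reference joint removes translation, and cosines between vectors are unchanged by rotation and uniform scaling. The per-frame features are then stacked over time into a 2-D array that can be treated as a grayscale image. The function names `frame_features` and `sequence_to_image` and the choice of reference joint are illustrative assumptions.

```python
import numpy as np


def frame_features(joints, ref=0):
    """Pairwise cosines between vectors from a reference joint (assumed design).

    joints: array of shape (J, 3) holding the 3-D joint positions of one frame.
    Returns a vector of length (J-1)(J-2)/2.
    """
    # Vectors from the reference joint: removes any global translation.
    vecs = np.delete(joints, ref, axis=0) - joints[ref]
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    unit = vecs / np.maximum(norms, 1e-8)
    # Cosines are unchanged by rotation and by uniform scaling.
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    upper = np.triu_indices(len(vecs), k=1)
    return cos[upper]


def sequence_to_image(seq, ref=0):
    """Stack per-frame features into a uint8 image (features x frames).

    seq: array of shape (T, J, 3).
    """
    feats = np.stack([frame_features(frame, ref) for frame in seq])  # (T, D)
    # Map cosines from [-1, 1] to gray levels in [0, 255].
    img = np.round((feats + 1.0) * 127.5).astype(np.uint8)
    return img.T


# Usage: a random 10-frame sequence with 5 joints.
rng = np.random.default_rng(0)
seq = rng.normal(size=(10, 5, 3))
image = sequence_to_image(seq)
print(image.shape, image.dtype)
```

An image built this way could be passed to any image CNN; the two-part network described in the abstract is not reproduced here.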
Pages: 731-735
Page count: 5