Exploiting deep residual networks for human action recognition from skeletal data

Cited by: 46
Authors
Pham, Huy-Hieu [1,2]
Khoudour, Louandi [1]
Crouzil, Alain [2]
Zegers, Pablo [3]
Velastin, Sergio A. [4,5]
Affiliations
[1] Ctr Etud & Expertise Risques Environm Mobilite &, F-31400 Toulouse, France
[2] Univ Toulouse, UPS, Inst Rech Informat Toulouse IRIT, F-31062 Toulouse 9, France
[3] Aparnix, La Gioconda 4355,10B, Santiago, Chile
[4] Univ Carlos III Madrid, Appl Artificial Intelligence Res Grp, Dept Comp Sci, Madrid 28270, Spain
[5] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
Keywords
3D action recognition; Deep residual networks; Skeletal data
DOI
10.1016/j.cviu.2018.03.003
CLC classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples and many challenges. In this effort, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state of the art in various vision-based action recognition systems. Recently, the introduction of residual connections into a more traditional CNN model, in a single architecture called the Residual Network (ResNet), has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets to human action recognition using skeletal data provided by depth sensors. First, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images capture the spatio-temporal evolution of 3D motions in skeleton sequences and can be learned efficiently by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD, and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks while requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin: 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset.
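The abstract's first step, mapping a skeleton sequence's 3D joint coordinates to an RGB image, can be sketched as below. This is a minimal illustration of the general encoding idea only: the function name, the per-axis min-max normalization, and the joints-as-rows/frames-as-columns layout are assumptions for the sketch, not necessarily the exact scheme used in the paper.

```python
import numpy as np

def skeleton_to_rgb_image(sequence):
    """Encode a skeleton sequence as an RGB image.

    sequence: array of shape (T, J, 3) -- T frames, J joints,
    (x, y, z) coordinates per joint.
    Returns a uint8 array of shape (J, T, 3): rows index joints,
    columns index frames, and the x/y/z coordinates map to the
    R/G/B channels after per-channel min-max normalization.
    """
    seq = np.asarray(sequence, dtype=np.float64)
    # Normalize each coordinate axis independently to [0, 255].
    mins = seq.min(axis=(0, 1), keepdims=True)
    maxs = seq.max(axis=(0, 1), keepdims=True)
    scaled = 255.0 * (seq - mins) / np.maximum(maxs - mins, 1e-8)
    # Transpose so joints form image rows and frames form columns.
    return scaled.transpose(1, 0, 2).astype(np.uint8)

# Toy example: 4 frames of a 5-joint skeleton.
rng = np.random.default_rng(0)
image = skeleton_to_rgb_image(rng.standard_normal((4, 5, 3)))
print(image.shape)  # (5, 4, 3)
```

The resulting fixed-layout color image is what a standard 2D CNN such as a ResNet can then consume for classification.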
Pages: 51-66
Page count: 16