Exploiting deep residual networks for human action recognition from skeletal data

Cited by: 46
Authors
Huy-Hieu Pham [1,2]
Khoudour, Louahdi [1]
Crouzil, Alain [2]
Zegers, Pablo [3]
Velastin, Sergio A. [4,5]
Affiliations
[1] Ctr Etud & Expertise Risques Environm Mobilite &, F-31400 Toulouse, France
[2] Univ Toulouse, UPS, Inst Rech Informat Toulouse IRIT, F-31062 Toulouse 9, France
[3] Aparnix, La Gioconda 4355,10B, Santiago, Chile
[4] Univ Carlos III Madrid, Appl Artificial Intelligence Res Grp, Dept Comp Sci, Madrid 28270, Spain
[5] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
Keywords
3D Action recognition; Deep residual networks; Skeletal data;
DOI
10.1016/j.cviu.2018.03.003
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory];
Discipline codes
081104; 0812; 0835; 1405;
Abstract
The computer vision community is currently focused on solving action recognition problems in real-world videos, which contain thousands of samples and present many challenges. In this effort, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art of various vision-based action recognition systems. Recently, the introduction of residual connections into a more traditional CNN model, in a single architecture called Residual Network (ResNet), has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets to human action recognition using skeletal data provided by depth sensors. Firstly, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images capture the spatial-temporal evolution of 3D motions in skeleton sequences and can be efficiently learned by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD, and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks whilst requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin: 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset.
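The joint-coordinates-to-RGB encoding described in the abstract can be sketched roughly as follows. This is a minimal illustration of the general idea (per-axis min-max normalization of joint coordinates mapped to color channels, with joints as rows and frames as columns); the function name, normalization scheme, and image layout are assumptions, not the paper's exact transformation.

```python
import numpy as np

def skeleton_to_rgb(seq):
    """Encode a skeleton sequence (frames x joints x 3) as an RGB image.

    Each 3D axis (x, y, z) is min-max normalized to [0, 255] and mapped
    to one color channel; rows index joints, columns index frames.
    Hypothetical sketch of the color-encoding idea, not the paper's
    exact procedure.
    """
    seq = np.asarray(seq, dtype=np.float64)        # shape (T, J, 3)
    mins = seq.min(axis=(0, 1), keepdims=True)     # per-axis minimum, (1, 1, 3)
    maxs = seq.max(axis=(0, 1), keepdims=True)     # per-axis maximum, (1, 1, 3)
    norm = (seq - mins) / np.maximum(maxs - mins, 1e-8)
    img = np.round(255.0 * norm).astype(np.uint8)  # quantize to 8-bit
    return img.transpose(1, 0, 2)                  # (J, T, 3): joints x frames x RGB

# Example: 30 frames of a 20-joint skeleton -> a 20 x 30 RGB image
rng = np.random.default_rng(0)
image = skeleton_to_rgb(rng.normal(size=(30, 20, 3)))
print(image.shape, image.dtype)  # (20, 30, 3) uint8
```

A D-CNN such as the proposed ResNet-based architecture can then be trained directly on these fixed-size color images, treating temporal evolution as the horizontal image axis.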
Pages: 51-66
Page count: 16
Related papers
50 records in total
  • [31] Deep Learning for Human Action Recognition
    Shekokar, R. U.
    Kale, S. N.
    2021 6TH INTERNATIONAL CONFERENCE FOR CONVERGENCE IN TECHNOLOGY (I2CT), 2021,
  • [32] Stratified pooling based deep convolutional neural networks for human action recognition
    Sheng Yu
    Yun Cheng
    Songzhi Su
    Guorong Cai
    Shaozi Li
    Multimedia Tools and Applications, 2017, 76 : 13367 - 13382
  • [33] Modelling Human Body Pose for Action Recognition Using Deep Neural Networks
    Chengyang Li
    Ruofeng Tong
    Min Tang
    Arabian Journal for Science and Engineering, 2018, 43 : 7777 - 7788
  • [34] A Survey on Deep Neural Networks for Human Action Recognition based on Skeleton Information
    Wang, Hongyu
    RECENT DEVELOPMENTS IN INTELLIGENT SYSTEMS AND INTERACTIVE APPLICATIONS (IISA2016), 2017, 541 : 329 - 336
  • [35] Deep multiple aggregation networks for action recognition
    Ahmed Mazari
    Hichem Sahbi
    International Journal of Multimedia Information Retrieval, 2024, 13
  • [36] Deep multiple aggregation networks for action recognition
    Mazari, Ahmed
    Sahbi, Hichem
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2024, 13 (01)
  • [37] Revolutionizing Image Recognition and Beyond with Deep Residual Networks
    Baraneedharan, P.
    Nithyasri, A.
    Keerthana, P.
    COMMUNICATION AND INTELLIGENT SYSTEMS, VOL 1, ICCIS 2023, 2024, 967 : 441 - 448
  • [38] SKELETAL MOVEMENT TO COLOR MAP: A NOVEL REPRESENTATION FOR 3D ACTION RECOGNITION WITH INCEPTION RESIDUAL NETWORKS
    Huy-Hieu Pham
    Khoudour, Louahdi
    Crouzil, Alain
    Zegers, Pablo
    Velastin, Sergio A.
    2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 3483 - 3487
  • [39] Multi-Modal Human Action Recognition Using Deep Neural Networks Fusing Image and Inertial Sensor Data
    Hwang, Inhwan
    Cha, Geonho
    Oh, Songhwai
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTISENSOR FUSION AND INTEGRATION FOR INTELLIGENT SYSTEMS (MFI), 2017, : 278 - 283
  • [40] Human Action Recognition Using Deep Data: A Fine-Grained Study
    Rao, D. Surendra
    Potturu, Sudharsana Rao
    Bhagyaraju, V.
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2022, 22 (06): : 97 - 108