Exploiting deep residual networks for human action recognition from skeletal data

Cited by: 45
Authors
Huy-Hieu Pham [1 ,2 ]
Khoudour, Louahdi [1 ]
Crouzil, Alain [2 ]
Zegers, Pablo [3 ]
Velastin, Sergio A. [4 ,5 ]
Affiliations
[1] Ctr Etud & Expertise Risques Environm Mobilite &, F-31400 Toulouse, France
[2] Univ Toulouse, UPS, Inst Rech Informat Toulouse IRIT, F-31062 Toulouse 9, France
[3] Aparnix, La Gioconda 4355,10B, Santiago, Chile
[4] Univ Carlos III Madrid, Appl Artificial Intelligence Res Grp, Dept Comp Sci, Madrid 28270, Spain
[5] Queen Mary Univ London, Sch Elect Engn & Comp Sci, London E1 4NS, England
Keywords
3D Action recognition; Deep residual networks; Skeletal data;
DOI
10.1016/j.cviu.2018.03.003
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples and many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art in various vision-based action recognition systems. Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets for human action recognition using skeletal data provided by depth sensors. Firstly, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images are able to capture the spatio-temporal evolution of 3D motions from skeleton sequences and can be efficiently learned by D-CNNs. We then propose a novel deep learning architecture based on ResNets to learn features from the obtained color-based representations and classify them into action classes. The proposed method is evaluated on three challenging benchmark datasets: MSR Action 3D, KARD, and NTU-RGB+D. Experimental results demonstrate that our method achieves state-of-the-art performance on all these benchmarks whilst requiring fewer computational resources. In particular, the proposed method surpasses previous approaches by a significant margin of 3.4% on the MSR Action 3D dataset, 0.67% on the KARD dataset, and 2.5% on the NTU-RGB+D dataset.
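The core encoding step described in the abstract can be illustrated with a minimal sketch. This is an assumption-laden toy version of the general skeleton-to-image idea, not the authors' exact mapping: each of the three coordinate axes (x, y, z) is normalized to [0, 255] and assigned to one color channel, so rows index joints, columns index time, and the resulting RGB image captures the spatio-temporal evolution of the sequence in a form a 2D CNN such as a ResNet can consume.

```python
import numpy as np

def skeleton_to_rgb_image(sequence):
    """Encode a skeleton sequence as an RGB image (illustrative sketch).

    sequence: array of shape (T, J, 3) -- T frames, J joints, and an
    (x, y, z) coordinate per joint. Each coordinate axis is min-max
    normalized to [0, 255] and mapped to one color channel; rows of
    the output index joints and columns index time.
    """
    seq = np.asarray(sequence, dtype=np.float64)   # (T, J, 3)
    img = np.transpose(seq, (1, 0, 2))             # (J, T, 3): joints x time x xyz
    # Per-channel min-max normalization over all joints and frames.
    lo = img.min(axis=(0, 1), keepdims=True)
    hi = img.max(axis=(0, 1), keepdims=True)
    img = (img - lo) / np.maximum(hi - lo, 1e-8) * 255.0
    return img.astype(np.uint8)

# Example: 40 frames of a hypothetical 20-joint skeleton.
rng = np.random.default_rng(0)
image = skeleton_to_rgb_image(rng.standard_normal((40, 20, 3)))
print(image.shape)  # (20, 40, 3)
```

The resulting fixed-layout image can then be resized to the input resolution of a standard ResNet and classified like any other color image.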
Pages: 51-66
Page count: 16
Related papers
50 records in total
  • [1] Deep Residual Temporal Convolutional Networks for Skeleton-Based Human Action Recognition
    Khamsehashari, R.
    Gadzicki, K.
    Zetzsche, C.
    [J]. COMPUTER VISION SYSTEMS (ICVS 2019), 2019, 11754 : 376 - 385
  • [2] Human Action Recognition Using Deep Neural Networks
    Koli, Rashmi R.
    Bagban, Tanveer I.
    [J]. PROCEEDINGS OF THE 2020 FOURTH WORLD CONFERENCE ON SMART TRENDS IN SYSTEMS, SECURITY AND SUSTAINABILITY (WORLDS4 2020), 2020, : 376 - 380
  • [3] A Deep Learning Approach for Real-Time 3D Human Action Recognition from Skeletal Data
    Huy Hieu Pham
    Salmane, Houssam
    Khoudour, Louahdi
    Crouzil, Alain
    Zegers, Pablo
    Velastin, Sergio A.
    [J]. IMAGE ANALYSIS AND RECOGNITION, ICIAR 2019, PT I, 2019, 11662 : 18 - 32
  • [4] Deep Residual Networks for Human Activity Recognition based on Biosignals from Wearable Devices
    Mekruksavanich, Sakorn
    Jantawong, Ponnipa
    Jitpattanakul, Anuchit
    [J]. 2022 45TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS AND SIGNAL PROCESSING, TSP, 2022, : 310 - 313
  • [5] A Deep Learning Approach for Human Action Recognition Using Skeletal Information
    Mathe, Eirini
    Maniatis, Apostolos
    Spyrou, Evaggelos
    Mylonas, Phivos
    [J]. GENEDIS 2018: COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2020, 1194 : 105 - 114
  • [6] Deep Residual Split Directed Graph Convolutional Neural Networks for Action Recognition
    Fu, Bo
    Fu, Shilin
    Wang, Liyan
    Dong, Yuhan
    Ren, Yonggong
    [J]. IEEE MULTIMEDIA, 2020, 27 (04) : 9 - 17
  • [7] Deep Metric Learning for Human Action Recognition with SlowFast Networks
    Shi, Shanmeng
    Jung, Cheolkon
    [J]. 2021 INTERNATIONAL CONFERENCE ON VISUAL COMMUNICATIONS AND IMAGE PROCESSING (VCIP), 2021,
  • [8] Exploiting Privileged Information from Web Data for Action and Event Recognition
    Li Niu
    Wen Li
    Dong Xu
    [J]. International Journal of Computer Vision, 2016, 118 : 130 - 150
  • [9] Exploiting Privileged Information from Web Data for Action and Event Recognition
    Niu, Li
    Li, Wen
    Xu, Dong
    [J]. INTERNATIONAL JOURNAL OF COMPUTER VISION, 2016, 118 (02) : 130 - 150
  • [10] Action Recognition with Skeletal Volume and Deep Learning
    Keceli, Ali Seydi
    Kaya, Aydin
    Can, Ahmet Burak
    [J]. 2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,