Deep Learning Based Human Activity Recognition Using Spatio-Temporal Image Formation of Skeleton Joints

Cited by: 31
Authors
Tasnim, Nusrat [1]
Islam, Mohammad Khairul [2]
Baek, Joong-Hwan [1]
Affiliations
[1] Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang 10540, South Korea
[2] Univ Chittagong, Dept Comp Sci & Engn, Chittagong 4331, Bangladesh
Source
APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 06
Keywords
spatio-temporal image formation; human activity recognition; deep learning; fusion strategies; transfer learning; SYSTEM;
DOI
10.3390/app11062675
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703;
Abstract
Human activity recognition has become a significant research trend in computer vision, image processing, and human-machine or human-object interaction, driven by concerns such as cost-effectiveness, time management, rehabilitation, and disease pandemics. Over the past years, several methods have been published for human action recognition using RGB (red, green, and blue), depth, and skeleton datasets. Most of the methods introduced for action classification using skeleton datasets are constrained in some respects, including feature representation, complexity, and performance. Providing an effective and efficient method for human action discrimination using a 3D skeleton dataset therefore remains a challenging problem. There is considerable room to map the 3D skeleton joint coordinates into spatio-temporal formats in order to reduce the complexity of the system, recognize human behaviors more accurately, and improve overall performance. In this paper, we propose a spatio-temporal image formation (STIF) technique for 3D skeleton joints that captures spatial information and temporal changes for action discrimination. We apply transfer learning (the pretrained models MobileNetV2, DenseNet121, and ResNet18, trained on the ImageNet dataset) to extract discriminative features, and we evaluate the proposed method with several fusion techniques. We mainly investigate the effect of three fusion methods, namely element-wise average, multiplication, and maximization, on human action recognition performance. With the STIF representation, our deep learning-based method outperforms prior works on UTD-MHAD (University of Texas at Dallas multi-modal human action dataset) and MSR-Action3D (Microsoft action 3D), two publicly available benchmark 3D skeleton datasets. We attain accuracies of approximately 98.93%, 99.65%, and 98.80% on UTD-MHAD and 96.00%, 98.75%, and 97.08% on MSR-Action3D using MobileNetV2, DenseNet121, and ResNet18, respectively.
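The three fusion methods named in the abstract (element-wise average, multiplication, and maximization) can be illustrated with a minimal sketch. The abstract does not say at which stage the fusion is applied or what is being fused, so the code below assumes, purely for illustration, that several equal-length score vectors (one per input stream or model) are combined element by element. The names `fuse` and `argmax` and all numeric values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

Scores = Sequence[float]


def fuse(score_sets: Sequence[Scores], mode: str) -> List[float]:
    """Combine equal-length score vectors element by element.

    Assumption: each vector holds one score per action class; the paper's
    actual fusion point is not stated in the abstract.
    """
    if not score_sets:
        raise ValueError("need at least one score vector")
    length = len(score_sets[0])
    if any(len(s) != length for s in score_sets):
        raise ValueError("all score vectors must have the same length")

    columns = list(zip(*score_sets))  # one tuple of scores per class
    if mode == "average":
        return [sum(col) / len(col) for col in columns]
    if mode == "multiplication":
        fused = []
        for col in columns:
            product = 1.0
            for value in col:
                product *= value
            fused.append(product)
        return fused
    if mode == "maximization":
        return [max(col) for col in columns]
    raise ValueError(f"unknown fusion mode: {mode}")


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


if __name__ == "__main__":
    # Hypothetical scores for three classes from two streams.
    stream_a = [0.6, 0.3, 0.1]
    stream_b = [0.2, 0.7, 0.1]
    for mode in ("average", "multiplication", "maximization"):
        fused = fuse([stream_a, stream_b], mode)
        print(mode, fused, "-> predicted class", argmax(fused))
```

Averaging smooths out disagreement between streams, multiplication penalizes any class that one stream scores low, and maximization keeps the most confident stream per class; which of these works best is an empirical question that the paper evaluates.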
Pages: 24
Related papers
50 in total
  • [1] Sensor-Based Human Activity Recognition with Spatio-Temporal Deep Learning
    Nafea, Ohoud
    Abdul, Wadood
    Muhammad, Ghulam
    Alsulaiman, Mansour
    [J]. SENSORS, 2021, 21 (06) : 1 - 20
  • [2] Spatio-temporal hard attention learning for skeleton-based activity recognition
    Nikpour, Bahareh
    Armanfard, Narges
    [J]. PATTERN RECOGNITION, 2023, 139
  • [3] Human Activity Recognition Based on Transfer Learning with Spatio-Temporal Representations
    Zebhi, Saeedeh
    Almodarresi, S. M. T.
    Abootalebi, Vahid
    [J]. INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2021, 18 (06) : 839 - 845
  • [4] Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN
    Mayank Lovanshi
    Vivek Tiwari
    [J]. Multimedia Tools and Applications, 2024, 83 : 12705 - 12730
  • [5] Human skeleton pose and spatio-temporal feature-based activity recognition using ST-GCN
    Lovanshi, Mayank
    Tiwari, Vivek
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (05) : 12705 - 12730
  • [6] Learning Dynamic Spatio-Temporal Relations for Human Activity Recognition
    Liu, Zhenyu
    Yao, Yaqiang
    Liu, Yan
    Zhu, Yuening
    Tao, Zhenchao
    Wang, Lei
    Feng, Yuhong
    [J]. IEEE ACCESS, 2020, 8 : 130340 - 130352
  • [7] LEARNING A HIERARCHICAL SPATIO-TEMPORAL MODEL FOR HUMAN ACTIVITY RECOGNITION
    Xu, Wanru
    Miao, Zhenjiang
    Zhang, Xiao-Ping
    Tian, Yi
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 1607 - 1611
  • [8] SKELETON ACTION RECOGNITION BASED ON SPATIO-TEMPORAL FEATURES
    Huang, Qian
    Xie, Mengting
    Li, Xing
    Wang, Shuaichen
    [J]. 2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 3284 - 3288
  • [9] Learning Complex Spatio-Temporal Configurations of Body Joints for Online Activity Recognition
    Qi, Jin
    Wang, Zhangjing
    Lin, Xiancheng
    Li, Chunming
    [J]. IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2018, 48 (06) : 637 - 647
  • [10] Skeleton-based Human Action Recognition Using Spatio-Temporal Geometry (ICCAS 2019)
    Ryu, Hanna
    Kim, Seong-heum
    Hwang, Youngbae
    [J]. 2019 19TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION AND SYSTEMS (ICCAS 2019), 2019, : 329 - 332