Deep Learning Based Human Activity Recognition Using Spatio-Temporal Image Formation of Skeleton Joints

被引:34
|
作者
Tasnim, Nusrat [1 ]
Islam, Mohammad Khairul [2 ]
Baek, Joong-Hwan [1 ]
机构
[1] Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang 10540, South Korea
[2] Univ Chittagong, Dept Comp Sci & Engn, Chittagong 4331, Bangladesh
来源
APPLIED SCIENCES-BASEL | 2021年 / 11卷 / 06期
关键词
spatio-temporal image formation; human activity recognition; deep learning; fusion strategies; transfer learning; SYSTEM;
D O I
10.3390/app11062675
中图分类号
O6 [化学];
学科分类号
0703 ;
摘要
Human activity recognition has become a significant research trend in the fields of computer vision, image processing, and human-machine or human-object interaction due to cost-effectiveness, time management, rehabilitation, and the pandemic of diseases. Over the past years, several methods published for human action recognition using RGB (red, green, and blue), depth, and skeleton datasets. Most of the methods introduced for action classification using skeleton datasets are constrained in some perspectives including features representation, complexity, and performance. However, there is still a challenging problem of providing an effective and efficient method for human action discrimination using a 3D skeleton dataset. There is a lot of room to map the 3D skeleton joint coordinates into spatio-temporal formats to reduce the complexity of the system, to provide a more accurate system to recognize human behaviors, and to improve the overall performance. In this paper, we suggest a spatio-temporal image formation (STIF) technique of 3D skeleton joints by capturing spatial information and temporal changes for action discrimination. We conduct transfer learning (pretrained models- MobileNetV2, DenseNet121, and ResNet18 trained with ImageNet dataset) to extract discriminative features and evaluate the proposed method with several fusion techniques. We mainly investigate the effect of three fusion methods such as element-wise average, multiplication, and maximization on the performance variation to human action recognition. Our deep learning-based method outperforms prior works using UTD-MHAD (University of Texas at Dallas multi-modal human action dataset) and MSR-Action3D (Microsoft action 3D), publicly available benchmark 3D skeleton datasets with STIF representation. We attain accuracies of approximately 98.93%, 99.65%, and 98.80% for UTD-MHAD and 96.00%, 98.75%, and 97.08% for MSR-Action3D skeleton datasets using MobileNetV2, DenseNet121, and ResNet18, respectively.
引用
收藏
页数:24
相关论文
共 50 条
  • [41] Abnormal Activity Recognition Using Spatio-Temporal Features
    Chathuramali, K. G. Manosha
    Ramasinghe, Sameera
    Rodrigo, Ranga
    2014 7TH INTERNATIONAL CONFERENCE ON INFORMATION AND AUTOMATION FOR SUSTAINABILITY (ICIAFS), 2014,
  • [42] Spatio-Temporal Information for Action Recognition in Thermal Video Using Deep Learning Model
    Srihari, P.
    Harikiran, J.
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2022, 13 (08) : 669 - 680
  • [43] Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates
    Liu, Jun
    Shahroudy, Amir
    Xu, Dong
    Kot, Alex C.
    Wang, Gang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (12) : 3007 - 3021
  • [44] Skeleton-based action recognition using spatio-temporal features with convolutional neural networks
    Rostami, Zahra
    Afrasiabi, Mahlagha
    Khotanlou, Hassan
    2017 IEEE 4TH INTERNATIONAL CONFERENCE ON KNOWLEDGE-BASED ENGINEERING AND INNOVATION (KBEI), 2017, : 583 - 587
  • [45] Deep learning based spatio-temporal hand gesture recognition system in complex environment
    Saboo, Shweta
    Singha, Joyeeta
    Laskar, Rabul Hussain
    EXPERT SYSTEMS, 2023, 40 (08)
  • [46] Human Action Recognition Using Spatio-temporal Classification
    Fang, Chin-Hsien
    Chen, Ju-Chin
    Tseng, Chien-Chung
    Lien, Jenn-Jier James
    COMPUTER VISION - ACCV 2009, PT II, 2010, 5995 : 98 - 109
  • [47] Efficient Spatio-Temporal Contrastive Learning for Skeleton-Based 3-D Action Recognition
    Gao, Xuehao
    Yang, Yang
    Zhang, Yimeng
    Li, Maosen
    Yu, Jin-Gang
    Du, Shaoyi
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 405 - 417
  • [48] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition
    Chen, Tailin
    Zhou, Desen
    Wang, Jian
    Wang, Shidong
    Guan, Yu
    He, Xuming
    Ding, Errui
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 4334 - 4342
  • [49] Human Action Recognition Based on Spatio-temporal Features
    Sawant, Nikhil
    Biswas, K. K.
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2009, 5909 : 357 - 362
  • [50] Spatio-Temporal Phrases for Activity Recognition
    Zhang, Yimeng
    Liu, Xiaoming
    Chang, Ming-Ching
    Ge, Weina
    Chen, Tsuhan
    COMPUTER VISION - ECCV 2012, PT III, 2012, 7574 : 707 - 721