Deep Learning Based Human Activity Recognition Using Spatio-Temporal Image Formation of Skeleton Joints

Cited by: 34

Authors:

Tasnim, Nusrat [1]

Islam, Mohammad Khairul [2]

Baek, Joong-Hwan [1]

Affiliations:

[1] Korea Aerosp Univ, Sch Elect & Informat Engn, Goyang 10540, South Korea

[2] Univ Chittagong, Dept Comp Sci & Engn, Chittagong 4331, Bangladesh

Source:

APPLIED SCIENCES-BASEL | 2021 / Vol. 11 / Issue 06

Keywords:

spatio-temporal image formation; human activity recognition; deep learning; fusion strategies; transfer learning; SYSTEM;

DOI:

10.3390/app11062675

Chinese Library Classification number:

O6 [Chemistry];

Discipline classification code:

0703 ;

Abstract:

Human activity recognition has become a significant research trend in the fields of computer vision, image processing, and human-machine or human-object interaction, driven by concerns such as cost-effectiveness, time management, rehabilitation, and disease pandemics. Over the past years, several methods have been published for human action recognition using RGB (red, green, and blue), depth, and skeleton datasets. Most of the methods introduced for action classification using skeleton datasets are constrained in some respects, including feature representation, complexity, and performance. Providing an effective and efficient method for human action discrimination using a 3D skeleton dataset therefore remains a challenging problem. There is considerable room to map the 3D skeleton joint coordinates into spatio-temporal formats in order to reduce the complexity of the system, recognize human behaviors more accurately, and improve overall performance. In this paper, we propose a spatio-temporal image formation (STIF) technique for 3D skeleton joints that captures spatial information and temporal changes for action discrimination. We apply transfer learning (the pretrained models MobileNetV2, DenseNet121, and ResNet18, trained on the ImageNet dataset) to extract discriminative features and evaluate the proposed method with several fusion techniques. We mainly investigate the effect of three fusion methods, namely element-wise average, multiplication, and maximization, on human action recognition performance. With the STIF representation, our deep learning-based method outperforms prior works on UTD-MHAD (University of Texas at Dallas multi-modal human action dataset) and MSR-Action3D (Microsoft action 3D), two publicly available benchmark 3D skeleton datasets. We attain accuracies of approximately 98.93%, 99.65%, and 98.80% for UTD-MHAD and 96.00%, 98.75%, and 97.08% for MSR-Action3D skeleton datasets using MobileNetV2, DenseNet121, and ResNet18, respectively.
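The three fusion operations named in the abstract (element-wise average, multiplication, and maximization) can be illustrated with a minimal sketch. The abstract does not say which outputs are fused or how many streams are involved, so this example assumes, purely for illustration, that each stream yields a per-class score vector of equal length; the names `fuse`, `predict`, and `FUSION_OPS` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Sequence


def _mean(values: Sequence[float]) -> float:
    """Arithmetic mean of the values found at one vector position."""
    return sum(values) / len(values)


def _product(values: Sequence[float]) -> float:
    """Product of the values found at one vector position."""
    result = 1.0
    for value in values:
        result *= value
    return result


# Element-wise fusion rules named in the abstract (keys are illustrative).
FUSION_OPS: Dict[str, Callable[[Sequence[float]], float]] = {
    "average": _mean,
    "multiplication": _product,
    "maximization": max,
}


def fuse(vectors: Sequence[Sequence[float]], method: str) -> List[float]:
    """Fuse equal-length score vectors position by position.

    Assumption: each vector holds per-class scores from one stream.
    """
    if not vectors:
        raise ValueError("at least one vector is required")
    length = len(vectors[0])
    if any(len(vector) != length for vector in vectors):
        raise ValueError("all vectors must have the same length")
    op = FUSION_OPS[method]
    # zip(*vectors) groups the i-th entry of every vector together.
    return [op(column) for column in zip(*vectors)]


def predict(vectors: Sequence[Sequence[float]], method: str) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse(vectors, method)
    return max(range(len(fused)), key=fused.__getitem__)


# Two hypothetical streams scoring two classes.
stream_a = [0.5, 0.25]
stream_b = [0.25, 0.75]
for name in FUSION_OPS:
    print(name, fuse([stream_a, stream_b], name), predict([stream_a, stream_b], name))
```

Averaging and maximization stay on the same scale as the inputs, whereas multiplication shrinks scores that any single stream rates low, so the three rules can rank classes differently when the streams disagree.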


Pages: 24

50 items in total

[21] Human Activity Recognition: A Spatio-temporal Image Encoding of 3D Skeleton Data for Online Action Detection
Mokhtari, Nassim
Nedelec, Alexis
De Loor, Pierre
PROCEEDINGS OF THE 17TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS (VISAPP), VOL 5, 2022, : 448 - 455
[22] A Spatio-Temporal Deep Learning Approach For Human Action Recognition in Infrared Videos
Shah, Anuj K.
Ghosh, Ripul
Akula, Aparna
OPTICS AND PHOTONICS FOR INFORMATION PROCESSING XII, 2018, 10751
[23] Human Action Recognition by Learning Spatio-Temporal Features With Deep Neural Networks
Wang, Lei
Xu, Yangyang
Cheng, Jun
Xia, Haiying
Yin, Jianqin
Wu, Jiaji
IEEE ACCESS, 2018, 6 : 17913 - 17922
[24] SPATIO-TEMPORAL MULTI-SCALE SOFT QUANTIZATION LEARNING FOR SKELETON-BASED HUMAN ACTION RECOGNITION
Yang, Jianyu
Zhu, Chen
Yuan, Junsong
2019 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2019, : 1078 - 1083
[25] STFC: Spatio-temporal feature chain for skeleton-based human action recognition
Ding, Wenwen
Liu, Kai
Cheng, Fei
Zhang, Jin
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2015, 26 : 329 - 337
[26] Action Recognition Based on Efficient Deep Feature Learning in the Spatio-Temporal Domain
Husain, Farzad
Dellen, Babette
Torras, Carme
IEEE ROBOTICS AND AUTOMATION LETTERS, 2016, 1 (02): : 984 - 991
[27] Vehicle recognition based on spatio-temporal image analysis
Hirahara, K
Ikeuchi, K
ITSC 2004: 7TH INTERNATIONAL IEEE CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, PROCEEDINGS, 2004, : 725 - 730
[28] Deep Spatio-Temporal Mutual Learning for EEG Emotion Recognition
Ye, Wenqing
Li, Xinyu
Zhang, Haokun
Zhu, Zhuolin
Li, Dongdong
2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
[29] A Hierarchical Spatio-Temporal Model for Human Activity Recognition
Xu, Wanru
Miao, Zhenjiang
Zhang, Xiao-Ping
Tian, Yi
IEEE TRANSACTIONS ON MULTIMEDIA, 2017, 19 (07) : 1494 - 1509
[30] Spatio-temporal stacking model for skeleton-based action recognition
Zhong, Yufeng
Yan, Qiuyan
APPLIED INTELLIGENCE, 2022, 52 (11) : 12116 - 12130
