Skeleton-based action recognition based on spatio-temporal adaptive graph convolutional neural-network

Cited by: 0
Authors
Cao Y. [1, 2]
Liu C. [1]
Huang Z. [1]
Sheng Y. [1]
Affiliations
[1] School of Mechanical Engineering, Jiangnan University, Wuxi
[2] Jiangsu Key Laboratory of Advanced Food Manufacturing Equipment and Technology, Jiangnan University, Wuxi
Keywords
Non-local structure; Skeleton-based action recognition; Spatio-temporal adaptive graph convolution; Temporal action graph; Temporal adaptive graph convolution
DOI
10.13245/j.hust.201102
Abstract
To address the lack of global-context temporal modeling, insufficient classification accuracy, and weak generalization in skeleton-based action recognition, a temporal modeling feature, the temporal action graph, and a skeleton-based action recognition model based on a spatio-temporal adaptive graph convolutional neural network (ST-AGCN) were proposed. First, graph representation theory and skeleton sequences were introduced, and a temporal action graph and its adjacency matrix based on an N-order fixed-time structure were designed. Then, a temporal adaptive graph convolutional network (T-AGCN) structure that combines a non-local construction with graph convolution theory was proposed. Furthermore, by combining T-AGCN with a spatial adaptive graph convolutional network (S-AGCN) structure, the ST-AGCN-based skeleton action recognition model was obtained. Finally, experiments were conducted on the NTU-RGB+D and SBU datasets to validate the model's advantages in modeling global-context temporal information, classification accuracy, and generalization ability. Experimental results show that the model achieved highest accuracies of 92.1% and 99.5% on the two datasets, respectively, demonstrating its accuracy and generalization ability. © 2020, Editorial Board of Journal of Huazhong University of Science and Technology. All rights reserved.
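The abstract gives no formulas, so the following is only an illustrative sketch of how a fixed N-order temporal adjacency matrix might be combined with a non-local (attention-style) adaptive term in a temporal graph convolution. Everything here is an assumption rather than the authors' implementation: the function names `temporal_adjacency` and `temporal_adaptive_gcn`, the symmetric normalization, the embedded-Gaussian form of the attention, the additive combination of the two matrices, and the weight shapes.

```python
import numpy as np

def temporal_adjacency(num_frames, order):
    """Hypothetical N-order fixed-time graph: frame t is linked to every
    frame within `order` steps (self-loops included), then the matrix is
    symmetrically normalized as D^{-1/2} A D^{-1/2}."""
    idx = np.arange(num_frames)
    adj = (np.abs(idx[:, None] - idx[None, :]) <= order).astype(np.float64)
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(x, axis=-1):
    """Numerically stable softmax along `axis`."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def temporal_adaptive_gcn(x, adj, w_theta, w_phi, w_out):
    """Assumed layer form: out = (A_fixed + A_attention) @ x @ W.

    x: (T, C) features of one joint over T frames.
    w_theta, w_phi: (C, C_e) embedding weights for the non-local term.
    w_out: (C, C_out) output projection.
    """
    theta = x @ w_theta                      # (T, C_e)
    phi = x @ w_phi                          # (T, C_e)
    attn = softmax(theta @ phi.T, axis=-1)   # (T, T), data-dependent, global
    return (adj + attn) @ x @ w_out          # (T, C_out)

# Usage with random placeholder data (not from the paper)
rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
A = temporal_adjacency(num_frames=6, order=1)
out = temporal_adaptive_gcn(
    x, A,
    rng.standard_normal((4, 3)),
    rng.standard_normal((4, 3)),
    rng.standard_normal((4, 5)),
)
print(out.shape)
```

In this sketch the fixed banded matrix only mixes nearby frames, while the attention term lets every frame attend to every other frame, which is one plausible reading of "global context temporal information" in the abstract.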
Pages: 5-10
Page count: 5