Skeleton-based action recognition with multi-stream, multi-scale dilated spatial-temporal graph convolution network

Authors
Haiping Zhang
Xu Liu
Dongjin Yu
Liming Guan
Dongjing Wang
Conghao Ma
Zepeng Hu
Affiliations
[1] Hangzhou Dianzi University, School of Computer Science
[2] Hangzhou Dianzi University, School of Information Engineering
[3] Hangzhou Dianzi University, School of Electronics and Information
Source
Applied Intelligence | 2023 / Vol. 53 (14)
Keywords
Action recognition; Skeleton; GCN; Multi-stream network
DOI
Not available
Abstract
Skeleton-based action recognition is receiving increasing attention in computer vision because it adapts well to dynamic environments and complex backgrounds. Representing human skeleton data as spatial-temporal graphs and processing them with graph convolutional networks (GCNs) has been shown to produce good recognition results. However, existing GCN methods often use a fixed-size convolution kernel to extract temporal features, which may not suit multi-level model structures. In addition, fusing the streams of a multi-stream network in equal proportion may ignore differences in the recognition ability of the individual streams. Both issues affect the final recognition result. In this paper, we propose (1) a multi-scale dilated temporal graph convolution layer (MDTGCL) and (2) a multi-branch feature fusion (MFF) structure. The MDTGCL uses multiple convolution kernels and dilated convolution to better adapt to the multi-layer structure of the GCN model and to capture spatial-temporal context over longer periods, yielding richer behavioural features. MFF performs weighted fusion of the multi-stream outputs to obtain the final recognition result. Because higher-order skeleton data are highly discriminative and more conducive to human action recognition, we jointly model spatial information on joints and bones, their multiple motion information, and bone angle information. Combining the above, we design a multi-stream, multi-scale dilated spatial-temporal graph convolutional network (2M-STGCN) and conduct extensive experiments on two large datasets (NTU RGB+D 60 and Kinetics Skeleton 400), which show that our model achieves state-of-the-art performance.
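The two ideas named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`receptive_field`, `dilated_conv1d`, `multi_scale_temporal`, `weighted_fusion`), the kernel values, the stream names, and the fusion weights are all hypothetical, and the abstract does not state how the actual MFF weights are obtained. The sketch only shows how dilation widens the temporal window covered by a fixed-size kernel, and how unequal stream weights can change the fused prediction relative to equal-proportion fusion.

```python
from typing import Dict, List, Sequence, Tuple


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of frames spanned by one dilated temporal kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Zero-padded, length-preserving 1D convolution along the time axis.

    Assumes an odd kernel size so the kernel is centred on each frame.
    """
    half = (len(kernel) - 1) // 2 * dilation
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for i, w in enumerate(kernel):
            idx = t - half + i * dilation
            if 0 <= idx < len(signal):  # positions outside the clip count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_scale_temporal(signal: Sequence[float],
                         branches: Sequence[Tuple[Sequence[float], int]]
                         ) -> List[List[float]]:
    """Run several (kernel, dilation) branches over the same input.

    A real layer would concatenate or sum the branch outputs along the
    channel axis; here they are simply returned side by side.
    """
    return [dilated_conv1d(signal, kernel, d) for kernel, d in branches]


def weighted_fusion(stream_scores: Dict[str, Sequence[float]],
                    weights: Dict[str, float]) -> List[float]:
    """Fuse per-stream class scores as a weighted sum (hypothetical weights)."""
    num_classes = len(next(iter(stream_scores.values())))
    fused = [0.0] * num_classes
    for name, scores in stream_scores.items():
        for c, s in enumerate(scores):
            fused[c] += weights[name] * s
    return fused


if __name__ == "__main__":
    frames = [1, 2, 3, 4, 5]  # one feature channel over five frames
    branches = [([1, 1, 1], 1), ([1, 1, 1], 2)]  # same kernel, two dilations
    for (kernel, d), out in zip(branches, multi_scale_temporal(frames, branches)):
        print(f"dilation={d} spans {receptive_field(len(kernel), d)} frames:", out)

    scores = {"joint": [0.6, 0.4], "bone": [0.3, 0.7]}  # made-up class scores
    print("equal:   ", weighted_fusion(scores, {"joint": 0.5, "bone": 0.5}))
    print("weighted:", weighted_fusion(scores, {"joint": 0.7, "bone": 0.3}))
```

With the made-up scores above, equal weights favour the second class while the unequal weights favour the first, which is the kind of difference between streams that equal-proportion fusion can hide.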
Pages: 17629 - 17643
Number of pages: 14