Skeleton-Based Action Recognition With Multi-Stream Adaptive Graph Convolutional Networks

Cited by: 292
Authors
Shi, Lei [1 ,2 ,3 ]
Zhang, Yifan [1 ,2 ,3 ]
Cheng, Jian [1 ,2 ,3 ]
Lu, Hanqing [1 ,2 ,3 ]
Affiliations
[1] Chinese Acad Sci, NLPR, Beijing 100049, Peoples R China
[2] Chinese Acad Sci, Inst Automat, AIRIA, Beijing 100049, Peoples R China
[3] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Adaptation models; Joints; Data models; Bones; Spatiotemporal phenomena; Task analysis; Skeleton-based action recognition; graph convolutional network; adaptive graph; multi-stream network;
DOI
10.1109/TIP.2020.3028207
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Graph convolutional networks (GCNs), which generalize CNNs to more generic non-Euclidean structures, have achieved remarkable performance for skeleton-based action recognition. However, several issues remain in previous GCN-based models. First, the topology of the graph is set heuristically and fixed over all the model layers and input data. This may not be suitable for the hierarchy of the GCN model and the diversity of the data in action recognition tasks. Second, the second-order information of the skeleton data, i.e., the length and orientation of the bones, is rarely investigated, although it is naturally more informative and discriminative for human action recognition. In this work, we propose a novel multi-stream attention-enhanced adaptive graph convolutional neural network (MS-AAGCN) for skeleton-based action recognition. The graph topology in our model can be either uniformly or individually learned based on the input data in an end-to-end manner. This data-driven approach increases the flexibility of the model for graph construction and brings more generality to adapt to various data samples. Besides, the proposed adaptive graph convolutional layer is further enhanced by a spatial-temporal-channel attention module, which helps the model pay more attention to important joints, frames and features. Moreover, the information of both the joints and bones, together with their motion information, is simultaneously modeled in a multi-stream framework, which shows notable improvement in recognition accuracy. Extensive experiments on the two large-scale datasets, NTU-RGBD and Kinetics-Skeleton, demonstrate that the performance of our model exceeds the state of the art by a significant margin.
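The joint, bone, and motion inputs named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how bones or motion are computed, so the sketch assumes a bone is the vector from one joint to a connected joint and motion is the difference between consecutive frames. The toy skeleton, the `bones` pair list, and the function names `bone_stream`, `motion_stream`, and `bone_length` are all hypothetical.

```python
import math

# Hypothetical toy skeleton: 3 joints in 2-D over 2 frames.
# frames[t][j] is the (x, y) coordinate of joint j at frame t.
frames = [
    [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.0, 1.0), (1.0, 2.0)],
]

# Assumed skeleton edges as (source_joint, target_joint) pairs.
bones = [(0, 1), (1, 2)]


def bone_stream(frames, bones):
    """Assumed bone definition: target joint minus source joint, per frame."""
    result = []
    for joints in frames:
        result.append([
            tuple(dst_k - src_k for src_k, dst_k in zip(joints[src], joints[dst]))
            for src, dst in bones
        ])
    return result


def motion_stream(frames):
    """Assumed motion definition: element-wise difference of consecutive frames."""
    result = []
    for prev_frame, next_frame in zip(frames, frames[1:]):
        result.append([
            tuple(n_k - p_k for p_k, n_k in zip(prev_vec, next_vec))
            for prev_vec, next_vec in zip(prev_frame, next_frame)
        ])
    return result


def bone_length(vec):
    """Euclidean length of a bone vector; its direction gives the orientation."""
    return math.sqrt(sum(v * v for v in vec))


# Four candidate input streams: joints, bones, and the motion of each.
joint_data = frames
bone_data = bone_stream(frames, bones)
joint_motion = motion_stream(joint_data)
bone_motion = motion_stream(bone_data)
```

In a multi-stream setup, each of the four arrays would feed its own network, and the per-stream predictions would then be combined. The combination rule is not given in the abstract and is left out here.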
Pages: 9532-9545
Number of pages: 14
Related papers
50 in total
  • [1] Multi-stream mixed graph convolutional networks for skeleton-based action recognition
    Zhuang, Boyuan
    Kong, Jun
    Jiang, Min
    Liu, Tianshan
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2021, 30 (06)
  • [2] Multi-stream slowFast graph convolutional networks for skeleton-based action recognition
    Sun, Ning
    Leng, Ling
    Liu, Jixin
    Han, Guang
    [J]. IMAGE AND VISION COMPUTING, 2021, 109
  • [3] Multi-stream P&U adaptive graph convolutional networks for skeleton-based action recognition
    Chen, Minglong
    Liang, Jiuzhen
    Liu, Hao
    [J]. JOURNAL OF SUPERCOMPUTING, 2024, 80 (08): 11614 - 11639
  • [4] Multi-stream P&U adaptive graph convolutional networks for skeleton-based action recognition
    Minglong Chen
    Jiuzhen Liang
    Hao Liu
    [J]. The Journal of Supercomputing, 2024, 80 : 11614 - 11639
  • [5] Multi-stream ternary enhanced graph convolutional network for skeleton-based action recognition
    Kong, Jun
    Wang, Shengquan
    Jiang, Min
    Liu, TianShan
    [J]. NEURAL COMPUTING & APPLICATIONS, 2023, 35 (25): 18487 - 18504
  • [6] Multi-stream ternary enhanced graph convolutional network for skeleton-based action recognition
    Jun Kong
    Shengquan Wang
    Min Jiang
    TianShan Liu
    [J]. Neural Computing and Applications, 2023, 35 : 18487 - 18504
  • [7] Multi-stream adaptive spatial-temporal attention graph convolutional network for skeleton-based action recognition
    Yu, Lubin
    Tian, Lianfang
    Du, Qiliang
    Bhutto, Jameel Ahmed
    [J]. IET COMPUTER VISION, 2022, 16 (02) : 143 - 158
  • [8] Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition
    Shi, Lei
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 12018 - 12027
  • [9] Multi-stream part-fused graph convolutional networks for skeleton-based gait recognition
    Wang, Likai
    Chen, Jinyan
    Chen, Zhenghang
    Liu, Yuxin
    Yang, Haolin
    [J]. CONNECTION SCIENCE, 2022, 34 (01) : 652 - 669
  • [10] A Multi-Stream Graph Convolutional Networks-Hidden Conditional Random Field Model for Skeleton-Based Action Recognition
    Liu, Kai
    Gao, Lei
    Khan, Naimul Mefraz
    Qi, Lin
    Guan, Ling
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2021, 23 : 64 - 76