Two-stream spatial-temporal neural networks for pose-based action recognition

Cited: 2
Authors
Wang, Zixuan [1 ]
Zhu, Aichun [1 ,2 ]
Hu, Fangqiang [1 ]
Wu, Qianyu [1 ]
Li, Yifeng [1 ]
Affiliations
[1] Nanjing Tech Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
action recognition; pose estimation; convolutional neural network; long short-term memory;
DOI
10.1117/1.JEI.29.4.043025
CLC (Chinese Library Classification) codes
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract
With recent advances in human pose estimation and human skeleton capture systems, pose-based action recognition has drawn considerable attention among researchers. Although most existing action recognition methods, which are based on convolutional neural networks and long short-term memory, achieve outstanding performance, one of their shortcomings is that they cannot explicitly exploit the rich spatial-temporal information between the skeletons in a behavior, which limits the accuracy of action recognition. To address this issue, two-stream spatial-temporal neural networks for pose-based action recognition are introduced. First, the pose features extracted from the raw video are processed by an action modeling module. Then, the temporal information and the spatial information, in the form of relative speed and relative distance, are fed into the temporal neural network and the spatial neural network, respectively. Afterward, the outputs of the two-stream networks are fused for better action recognition. Finally, we perform comprehensive experiments on the SUB-JHMDB, SYSU, MPII-Cooking, and NTU RGB+D datasets, the results of which demonstrate the effectiveness of the proposed model. (C) 2020 SPIE and IS&T
Pages: 16
Related papers
50 records total
  • [41] Skeleton-based Action Recognition Using Two-stream Graph Convolutional Network with Pose Refinement
    Zheng, Biao
    Chen, Luefeng
    Wu, Min
    Pedrycz, Witold
    Hirota, Kaoru
    2022 41ST CHINESE CONTROL CONFERENCE (CCC), 2022, : 6353 - 6356
  • [42] Beyond Two-stream: Skeleton-based Three-stream Networks for Action Recognition in Videos
    Xu, Jianfeng
    Tasaka, Kazuyuki
    Yanagihara, Hiromasa
    2018 24TH INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2018, : 1567 - 1573
  • [43] STA-GCN: two-stream graph convolutional network with spatial-temporal attention for hand gesture recognition
    Zhang, Wei
    Lin, Zeyi
    Cheng, Jian
    Ma, Cuixia
    Deng, Xiaoming
    Wang, Hongan
    VISUAL COMPUTER, 2020, 36 (10-12): : 2433 - 2444
  • [44] Transferable two-stream convolutional neural network for human action recognition
    Xiong, Qianqian
    Zhang, Jianjing
    Wang, Peng
    Liu, Dongdong
    Gao, Robert X.
    JOURNAL OF MANUFACTURING SYSTEMS, 2020, 56 : 605 - 614
  • [45] Two-Stream Spatial–Temporal Transformer Networks for Driver Drowsiness Detection
    Jiang, Qianyi
    Xu, Huahu
    Cheng, Chen
    Journal of Computers (Taiwan), 2023, 34 (05) : 103 - 115
  • [46] Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition
    Shi, Lei
    Zhang, Yifan
    Cheng, Jian
    Lu, Hanqing
    2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 12018 - 12027
  • [47] Two-Stream Adaptive Attention Graph Convolutional Networks for Action Recognition
    Du Q.
    Xiang Z.
    Tian L.
    Yu L.
    Huanan Ligong Daxue Xuebao/Journal of South China University of Technology (Natural Science), 2022, 50 (12): : 20 - 29
  • [48] Distinct Two-Stream Convolutional Networks for Human Action Recognition in Videos Using Segment-Based Temporal Modeling
    Sarabu, Ashok
    Santra, Ajit Kumar
    DATA, 2020, 5 (04) : 1 - 12
  • [49] Driver Distraction Recognition with Pose-aware Two-stream Convolutional Neural Network
    Tao, Chenghao
    Ma, Sheqiang
    Proceedings of SPIE - The International Society for Optical Engineering, 2023, 12790
  • [50] Two-Stream Temporal Feature Aggregation Based on Clustering for Few-Shot Action Recognition
    Deng, Long
    Li, Ao
    Zhou, Bingxin
    Ge, Yongxin
    IEEE SIGNAL PROCESSING LETTERS, 2024, 31 : 2435 - 2439