Two-stream spatial-temporal neural networks for pose-based action recognition

Cited by: 2
Authors
Wang, Zixuan [1 ]
Zhu, Aichun [1 ,2 ]
Hu, Fangqiang [1 ]
Wu, Qianyu [1 ]
Li, Yifeng [1 ]
Affiliations
[1] Nanjing Tech Univ, Sch Comp Sci & Technol, Nanjing, Peoples R China
[2] China Univ Min & Technol, Sch Informat & Control Engn, Xuzhou, Jiangsu, Peoples R China
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China;
Keywords
action recognition; pose estimation; convolutional neural network; long short-term memory;
DOI
10.1117/1.JEI.29.4.043025
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline codes
0808; 0809;
Abstract
With recent advances in human pose estimation and human skeleton capture systems, pose-based action recognition has drawn considerable attention among researchers. Most existing action recognition methods are based on convolutional neural networks and long short-term memory and achieve outstanding performance, but they lack the ability to explicitly exploit the rich spatial-temporal information between the skeletons in a behavior, which limits the accuracy of action recognition. To address this issue, two-stream spatial-temporal neural networks for pose-based action recognition are introduced. First, the pose features extracted from the raw video are processed by an action modeling module. Then, the temporal information and the spatial information, in the form of relative speed and relative distance, are fed into the temporal neural network and the spatial neural network, respectively. Afterward, the outputs of the two-stream networks are fused for better action recognition. Finally, comprehensive experiments on the SUB-JHMDB, SYSU, MPII-Cooking, and NTU RGB+D datasets demonstrate the effectiveness of the proposed model. (C) 2020 SPIE and IS&T
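The abstract's spatial and temporal stream inputs (relative distance within a frame, relative speed across frames) can be sketched as follows. This is a minimal illustration assuming 2-D joint coordinates; the function names and the toy poses are illustrative assumptions, not the authors' exact formulation.

```python
# Sketch of the two kinds of pose features described in the abstract:
# relative distance between joints within one frame (spatial-stream input)
# and relative speed of each joint across frames (temporal-stream input).
# The feature definitions here are plausible assumptions for illustration.
import math
from itertools import combinations

def relative_distances(frame):
    """Pairwise Euclidean distances between 2-D joint coordinates."""
    return [math.dist(a, b) for a, b in combinations(frame, 2)]

def relative_speeds(prev_frame, frame, dt=1.0):
    """Per-joint displacement between consecutive frames, divided by dt."""
    return [math.dist(p, q) / dt for p, q in zip(prev_frame, frame)]

# Toy 3-joint pose over two consecutive frames (hypothetical data)
f0 = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
f1 = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0)]

spatial_input = relative_distances(f1)    # fed to the spatial stream
temporal_input = relative_speeds(f0, f1)  # fed to the temporal stream
```

In the paper's pipeline these feature sequences would then pass through the two networks, whose outputs are fused for the final prediction.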
Pages: 16
Related papers
50 records in total
  • [21] Two-stream spatiotemporal networks for skeleton action recognition
    Wang, Lei
    Zhang, Jianwei
    Yang, Shanmin
    Gu, Song
    IET IMAGE PROCESSING, 2023, 17 (11) : 3358 - 3370
  • [22] Human Activities Recognition Based on Two-stream NonLocal Spatial Temporal Residual Convolution Neural Network
    Qian H.
    Chen S.
    Huangfu X.
    Dianzi Yu Xinxi Xuebao/Journal of Electronics and Information Technology, 2024, 46 (03): : 1100 - 1108
  • [23] Two-Stream Adaptive Weight Convolutional Neural Network Based on Spatial Attention for Human Action Recognition
    Chen, Guanzhou
    Yao, Lu
    Xu, Jingting
    Liu, Qianxi
    Chen, Shengyong
    INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2022), PT IV, 2022, 13458 : 319 - 330
  • [24] Direction-guided two-stream convolutional neural networks for skeleton-based action recognition
    Benyue Su
    Peng Zhang
    Manzhen Sun
    Min Sheng
    Soft Computing, 2023, 27 : 11833 - 11842
  • [25] Direction-guided two-stream convolutional neural networks for skeleton-based action recognition
    Su, Benyue
    Zhang, Peng
    Sun, Manzhen
    Sheng, Min
    SOFT COMPUTING, 2023, 27 (16) : 11833 - 11842
  • [26] Human Action Recognition based on Two-Stream Ind Recurrent Neural Network
    Ge Penghua
    Zhi Min
    TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069
  • [27] Two-Stream Collaborative Learning With Spatial-Temporal Attention for Video Classification
    Peng, Yuxin
    Zhao, Yunzhen
    Zhang, Junchao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (03) : 773 - 786
  • [28] Two-Stream Convolutional Neural Network for Video Action Recognition
    Qiao, Han
    Liu, Shuang
    Xu, Qingzhen
    Liu, Shouqiang
    Yang, Wanggan
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (10): : 3668 - 3684
  • [29] Human action recognition using two-stream attention based LSTM networks
    Dai, Cheng
    Liu, Xingang
    Lai, Jinfeng
    APPLIED SOFT COMPUTING, 2020, 86
  • [30] Human Action Recognition by Fusion of Convolutional Neural Networks and spatial-temporal Information
    Li, Weisheng
    Ding, Yahui
    8TH INTERNATIONAL CONFERENCE ON INTERNET MULTIMEDIA COMPUTING AND SERVICE (ICIMCS2016), 2016, : 255 - 259