Human Action Recognition Research Based on Fusion TS-CNN and LSTM Networks

Cited by: 0
Authors
Hui Zan
Gang Zhao
Affiliations
[1] Zhejiang Normal University, Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province
[2] Central China Normal University, Faculty of Artificial Intelligence in Education
Keywords
Multistream network; Human action recognition; TS-LSTM; CNN-LSTM
DOI
Not available
Abstract
Human action recognition (HAR) technology is currently of significant interest. Traditional HAR methods generally depend on the temporal and spatial information in the video stream. They require large training datasets and have long response times, so they fail to meet the three requirements of real-time interaction simultaneously: high accuracy, low delay, and low computational cost. For instance, a gymnastic action can last as little as 0.2 s, and the system must go from action capture to recognition and then to visualization of a three-dimensional character model. Only when the response time of the application system is short enough can it guide synchronous training and support accurate evaluation. To reduce the dependence on the amount of video data and meet these HAR requirements, this paper proposes a three-stream framework (TS-CNN-LSTM) that combines CNN and long short-term memory (LSTM) networks. Firstly, color, depth, and skeleton data of the human body collected by Microsoft Kinect are used as input to reduce the sample size. Secondly, heterogeneous convolutional networks are established to reduce computing cost and shorten response time. The experimental results demonstrate the effectiveness of the proposed model on the NTU-RGB + D dataset, where it reaches a best accuracy of 87.28% in the Cross-subject mode. Compared with state-of-the-art methods, our method uses 75% of the training sample size, while its time and space complexity are only 67.5% and 73.98% of theirs, respectively. The response time for recognizing one set of actions is shortened by 0.90–1.61 s, which is especially valuable for timely action feedback. The proposed method provides an effective solution for real-time interactive applications that require timely human action recognition results and responses.
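The abstract names the ingredients (three input streams, per-stream convolutional networks, and an LSTM) but not how they are wired together. The following is a minimal sketch of one plausible arrangement, not the authors' implementation. It assumes per-frame features from each stream are concatenated before a single LSTM, uses a linear layer with ReLU as a stand-in for each stream's CNN, omits bias terms, and uses random untrained weights. All function names, feature sizes, and the fusion strategy are hypothetical; only the three modalities and the CNN-then-LSTM ordering come from the abstract. The 60 output classes and the 75-value skeleton vector (25 joints with 3 coordinates each) follow the NTU RGB+D dataset.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def stream_features(frame, W):
    """Stand-in for a per-stream CNN (assumption): linear layer + ReLU."""
    return [max(0.0, z) for z in matvec(W, frame)]


def lstm_step(x, h, c, W):
    """One standard LSTM step without biases; W maps [x, h] to 4 gate blocks."""
    n = len(h)
    z = matvec(W, x + h)
    i = [sigmoid(v) for v in z[0:n]]          # input gate
    f = [sigmoid(v) for v in z[n:2 * n]]      # forget gate
    o = [sigmoid(v) for v in z[2 * n:3 * n]]  # output gate
    g = [math.tanh(v) for v in z[3 * n:4 * n]]  # candidate cell state
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init_params(in_dims, feat_dim, hidden, num_classes, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    return {
        "hidden": hidden,
        "cnn": {name: rand_matrix(feat_dim, d, rng) for name, d in in_dims.items()},
        "lstm": rand_matrix(4 * hidden, len(in_dims) * feat_dim + hidden, rng),
        "out": rand_matrix(num_classes, hidden, rng),
    }


def classify_clip(clip, params):
    """clip: list of frames, each a dict with 'rgb', 'depth', 'skeleton' vectors."""
    h = [0.0] * params["hidden"]
    c = [0.0] * params["hidden"]
    for frame in clip:
        fused = []
        for name in ("rgb", "depth", "skeleton"):
            # Concatenation fusion is an assumption, not taken from the abstract.
            fused += stream_features(frame[name], params["cnn"][name])
        h, c = lstm_step(fused, h, c, params["lstm"])
    # Class probabilities come from the last hidden state.
    return softmax(matvec(params["out"], h))


# Toy input: 5 frames of random values standing in for Kinect data.
in_dims = {"rgb": 12, "depth": 12, "skeleton": 75}
params = init_params(in_dims, feat_dim=8, hidden=16, num_classes=60)
data_rng = random.Random(1)
clip = [
    {name: [data_rng.random() for _ in range(d)] for name, d in in_dims.items()}
    for _ in range(5)
]
probs = classify_clip(clip, params)
print(len(probs), round(sum(probs), 6))
```

Because the weights are untrained, the output probabilities carry no meaning; the sketch only shows how data from the three modalities could flow through per-stream feature extractors into a shared recurrent layer.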
Pages: 2331-2345
Page count: 14