Deep learning network model based on fusion of spatiotemporal features for action recognition

Cited by: 3
Authors
Yang, Ge [1 ,2 ,3 ]
Zou, Wu-xing [1 ,2 ]
Affiliations
[1] Beijing Normal Univ, Key Lab Intelligent Multimedia Technol, Zhuhai 519087, Peoples R China
[2] Beijing Normal Univ Zhuhai, Adv Inst Nat Sci, Zhuhai 519087, Peoples R China
[3] Peking Univ, Shenzhen Grad Sch, Engn Lab Intelligent Percept Internet Things ELIP, Shenzhen 518055, Peoples R China
Keywords
Action recognition; Deep learning; CNN; LSTM
DOI
10.1007/s11042-022-11937-w
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Current deep learning networks do not fully extract and fuse spatio-temporal information in action recognition tasks, which results in low recognition accuracy. To address this problem, this paper proposes a deep learning network model based on fusion of spatio-temporal features (FSTFN). Two networks composed of CNN (Convolutional Neural Network) and LSTM (Long Short-Term Memory) components extract and fuse the temporal and spatial information. Multi-segment input is used to process large-scale video frame information, which addresses the problem of long-term dependence and improves prediction accuracy. An attention mechanism increases the weight of visual subjects in the network. Experiments on the UCF101 (University of Central Florida 101) data set show that the proposed FSTFN reaches a prediction accuracy of 92.7%, 4.7 percentage points higher than Two-stream, which verifies the effectiveness of the network model.
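The abstract names multi-segment input and the fusion of two streams but does not give their details. The following is a minimal sketch of how such a pipeline is often arranged, not the authors' implementation: it assumes one frame is sampled per equal-length segment and that the two streams are combined by a weighted average of per-segment softmax scores. The function names `sample_segment_indices` and `fuse_streams` and the parameter `spatial_weight` are hypothetical, and the CNN, LSTM, and attention components are left out entirely.

```python
import math
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Split [0, num_frames) into equal segments and pick one frame index per segment.

    Assumption: one randomly chosen frame per segment; the paper may sample differently.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random(0)
    # Integer segment boundaries; every segment holds at least one frame.
    bounds = [i * num_frames // num_segments for i in range(num_segments + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_segments)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_streams(spatial_scores, temporal_scores, spatial_weight=0.5):
    """Late fusion of two streams (an assumed scheme, not taken from the paper).

    Each argument is a list of per-segment class-score lists. Scores are turned
    into probabilities per segment, averaged over segments within each stream,
    and the two stream averages are combined with a weighted sum.
    """
    def consensus(per_segment):
        probs = [softmax(s) for s in per_segment]
        return [sum(col) / len(probs) for col in zip(*probs)]

    spatial = consensus(spatial_scores)
    temporal = consensus(temporal_scores)
    w = spatial_weight
    return [w * a + (1.0 - w) * b for a, b in zip(spatial, temporal)]


if __name__ == "__main__":
    # Hypothetical 90-frame clip split into 3 segments.
    print(sample_segment_indices(90, 3))
    # Made-up scores for 3 classes and 2 segments per stream.
    print(fuse_streams([[2.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
                       [[0.0, 0.0, 3.0], [0.0, 0.0, 3.0]],
                       spatial_weight=0.3))
```

In a full model the per-segment scores would come from the CNN and LSTM networks that the abstract describes; here they are placeholder numbers chosen only to exercise the functions.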
Pages: 9875 - 9896
Number of pages: 22
Related papers
50 in total
  • [1] Deep learning network model based on fusion of spatiotemporal features for action recognition
    Ge Yang
    Wu-xing Zou
    [J]. Multimedia Tools and Applications, 2022, 81 : 9875 - 9896
  • [2] Spatiotemporal attention enhanced features fusion network for action recognition
    Zhuang, Danfeng
    Jiang, Min
    Kong, Jun
    Liu, Tianshan
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2021, 12 (03) : 823 - 841
  • [3] Spatiotemporal attention enhanced features fusion network for action recognition
    Danfeng Zhuang
    Min Jiang
    Jun Kong
    Tianshan Liu
    [J]. International Journal of Machine Learning and Cybernetics, 2021, 12 : 823 - 841
  • [4] A deep multimodal network based on bottleneck layer features fusion for action recognition
    Singh, Tej
    Vishwakarma, Dinesh Kumar
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (24) : 33505 - 33525
  • [5] A deep multimodal network based on bottleneck layer features fusion for action recognition
    Tej Singh
    Dinesh Kumar Vishwakarma
    [J]. Multimedia Tools and Applications, 2021, 80 : 33505 - 33525
  • [6] Feature Fusion of Deep Spatial Features and Handcrafted Spatiotemporal Features for Human Action Recognition
    Uddin, Md Azher
    Lee, Young-Koo
    [J]. SENSORS, 2019, 19 (07)
  • [7] A Spatiotemporal Fusion Network For Skeleton-Based Action Recognition
    Bao, Wenxia
    Wang, Junyi
    Yang, Xianjun
    Chen, Hemu
    [J]. 2024 3RD INTERNATIONAL CONFERENCE ON IMAGE PROCESSING AND MEDIA COMPUTING, ICIPMC 2024, 2024 : 347 - 352
  • [8] Human Action Recognition Based on Multiple Features and Modified Deep Learning Model
    Zhu, Shaoping
    Xiao, Yongliang
    Ma, Weimin
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 2020, 34 (10)
  • [9] Spatiotemporal information deep fusion network with frame attention mechanism for video action recognition
    Ou, Hongshi
    Sun, Jifeng
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (02)
  • [10] Multi-Layered Deep Learning Features Fusion for Human Action Recognition
    Kiran, Sadia
    Khan, Muhammad Attique
    Javed, Muhammad Younus
    Alhaisoni, Majed
    Tariq, Usman
    Nam, Yunyoung
    Damasevicius, Robertas
    Sharif, Muhammad
    [J]. CMC-COMPUTERS MATERIALS & CONTINUA, 2021, 69 (03) : 4061 - 4075