Spatial-temporal multiscale feature optimization based two-stream convolutional neural network for action recognition

Cited by: 1
Authors
Xia, Limin [1]
Fu, Weiye [1]
Affiliations
[1] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; Two-stream network; Attention mechanism; Multiscale features
DOI
10.1007/s10586-024-04553-w
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Human action recognition is one of the most challenging tasks in computer vision due to background noise interference and video frame redundancy. Therefore, we propose a two-stream Convolutional Neural Network based on Spatial-Temporal Multiscale Feature Optimization (ST-MFO). Specifically, multiscale features generated by a pyramid pooling network are combined with improved coordinate attention, which results in richer feature representation and reduces background noise interference. Meanwhile, we introduce density peak clustering based on a nonlinear kernel function, which can extract more representative key frames. To improve classification efficiency, we also assign varying degrees of attention to key frames through temporal attention. In addition, we propose an attention-based spatial-temporal information interaction module that optimizes temporal and spatial features with complementarity between temporal and spatial information. Experimental results on four benchmark video datasets show that ST-MFO achieves comparable or better performance than state-of-the-art methods.
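As a rough illustration of the key-frame step mentioned in the abstract, the sketch below selects key frames by density peak clustering. It is not the authors' implementation: a Gaussian kernel stands in for the nonlinear kernel, which the abstract does not specify, and the function name `density_peak_key_frames`, the cutoff distance `d_c`, and the per-frame feature vectors are all hypothetical.

```python
import math


def density_peak_key_frames(features, k, d_c):
    """Pick k key frames as density peaks (illustrative sketch only).

    features: list of equal-length numeric tuples, one per frame (assumed input).
    k:        number of key frames to return.
    d_c:      cutoff distance of the Gaussian kernel (assumed hyperparameter).
    """
    n = len(features)
    # Pairwise Euclidean distances between frame features.
    dist = [[math.dist(features[i], features[j]) for j in range(n)]
            for i in range(n)]

    # Local density rho_i with a Gaussian kernel (assumption: the abstract
    # only says "nonlinear kernel function").
    rho = [
        sum(math.exp(-((dist[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # delta_i: distance to the nearest frame with strictly higher density.
    # The densest frame gets its largest distance to any other frame.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # Frames that are both dense and far from denser frames score highest.
    gamma = [rho[i] * delta[i] for i in range(n)]
    top = sorted(range(n), key=lambda i: gamma[i], reverse=True)[:k]
    return sorted(top)  # return key frames in temporal order


if __name__ == "__main__":
    # Two toy groups of frames in a 2-D feature space.
    frames = [
        (0.0, 0.0), (0.2, 0.0), (-0.2, 0.0), (0.0, 0.2), (0.0, -0.2),
        (10.0, 10.0), (10.2, 10.0), (9.8, 10.0), (10.0, 10.2),
    ]
    print(density_peak_key_frames(frames, k=2, d_c=0.5))
```

In this toy input the two selected indices are the central frames of the two groups. A temporal-attention stage of the kind the abstract describes would then weight the features of those frames, which this sketch does not attempt.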
Pages: 11611 - 11626
Page count: 16
Related papers
50 in total
  • [1] Spatial-temporal interaction learning based two-stream network for action recognition
    Liu, Tianyu
    Ma, Yujun
    Yang, Wenhan
    Ji, Wanting
    Wang, Ruili
    Jiang, Ping
    INFORMATION SCIENCES, 2022, 606 : 864 - 876
  • [2] Two-stream spatial-temporal neural networks for pose-based action recognition
    Wang, Zixuan
    Zhu, Aichun
    Hu, Fangqiang
    Wu, Qianyu
    Li, Yifeng
    JOURNAL OF ELECTRONIC IMAGING, 2020, 29 (04)
  • [3] Two-Stream Convolutional Neural Network for Video Action Recognition
    Qiao, Han
    Liu, Shuang
    Xu, Qingzhen
    Liu, Shouqiang
    Yang, Wanggan
    KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2021, 15 (10) : 3668 - 3684
  • [4] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
    Zheng, Zhenxing
    An, Gaoyun
    Wu, Dapeng
    Ruan, Qiuqi
    NEUROCOMPUTING, 2019, 358 : 446 - 455
  • [5] Two-Stream Adaptive Weight Convolutional Neural Network Based on Spatial Attention for Human Action Recognition
    Chen, Guanzhou
    Yao, Lu
    Xu, Jingting
    Liu, Qianxi
    Chen, Shengyong
    INTELLIGENT ROBOTICS AND APPLICATIONS (ICIRA 2022), PT IV, 2022, 13458 : 319 - 330
  • [6] Transferable two-stream convolutional neural network for human action recognition
    Xiong, Qianqian
    Zhang, Jianjing
    Wang, Peng
    Liu, Dongdong
    Gao, Robert X.
    JOURNAL OF MANUFACTURING SYSTEMS, 2020, 56 : 605 - 614
  • [7] Skeleton-Based Action Recognition Through Contrasting Two-Stream Spatial-Temporal Networks
    Pang, Chen
    Lu, Xuequan
    Lyu, Lei
    IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25 : 8699 - 8711
  • [8] Human Action Recognition Based on a Two-stream Convolutional Network Classifier
    Silva, Vincius de Oliveira
    Vidal, Flavio de Barros
    Soares Romariz, Alexandre Ricardo
    2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017 : 774 - 778
  • [9] Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network
    Zhang, Kao
    Chen, Zhenzhong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2019, 29 (12) : 3544 - 3557
  • [10] Improved human action recognition approach based on two-stream convolutional neural network model
    Liu, Congcong
    Ying, Jie
    Yang, Haima
    Hu, Xing
    Liu, Jin
    The Visual Computer, 2021, 37 : 1327 - 1341