Spatial-temporal multiscale feature optimization based two-stream convolutional neural network for action recognition

Cited by: 1
Authors
Xia, Limin [1 ]
Fu, Weiye [1 ]
Affiliations
[1] Cent South Univ, Sch Automat, Changsha 410083, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; Two-stream network; Attention mechanism; Multiscale features;
DOI
10.1007/s10586-024-04553-w
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Human action recognition is one of the most challenging tasks in computer vision due to background noise interference and video frame redundancy. To address these problems, we propose a two-stream Convolutional Neural Network based on Spatial-Temporal Multiscale Feature Optimization (ST-MFO). Specifically, multiscale features generated by a pyramid pooling network are combined with improved coordinate attention, which yields a richer feature representation and reduces background noise interference. Meanwhile, we introduce density peak clustering based on a nonlinear kernel function, which can extract more representative key frames. To improve classification efficiency, we also assign varying degrees of attention to key frames through temporal attention. In addition, we propose an attention-based spatial-temporal information interaction module that optimizes temporal and spatial features by exploiting the complementarity between temporal and spatial information. Experimental results on four benchmark video datasets show that ST-MFO achieves comparable or better performance than state-of-the-art methods.
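The key-frame step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which nonlinear kernel, frame descriptors, or attention scoring are used, so the Gaussian kernel, the median-distance bandwidth default, the softmax weighting, and the function names `select_key_frames` and `temporal_attention_pool` are all assumptions made here for illustration.

```python
import numpy as np


def select_key_frames(features, num_key_frames, bandwidth=None):
    """Pick key frames via density peak clustering.

    Assumption: a Gaussian kernel stands in for the paper's unspecified
    nonlinear kernel; `features` is an (n_frames, dim) array of per-frame
    descriptors of any kind.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise Euclidean distances between frame descriptors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    if bandwidth is None:
        # Hypothetical default: median of the nonzero pairwise distances.
        positive = dist[dist > 0]
        bandwidth = np.median(positive) if positive.size else 1.0
    # Local density rho_i: Gaussian kernel summed over all other frames.
    rho = np.exp(-(dist / bandwidth) ** 2).sum(axis=1) - 1.0
    # delta_i: distance to the nearest frame with higher density;
    # the densest frame gets its largest distance to any frame.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    delta[order[0]] = dist[order[0]].max()
    for rank in range(1, n):
        i = order[rank]
        delta[i] = dist[i, order[:rank]].min()
    # Frames that are both dense and well separated act as cluster centers.
    score = rho * delta
    k = min(num_key_frames, n)
    return np.sort(np.argsort(-score, kind="stable")[:k])


def temporal_attention_pool(features, scores):
    """Weight key-frame features by a softmax over per-frame scores.

    Assumption: `scores` would come from a learned layer; here they are
    supplied by the caller.
    """
    scores = np.asarray(scores, dtype=float)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    pooled = weights @ np.asarray(features, dtype=float)
    return pooled, weights
```

Under these assumptions, `select_key_frames(frame_features, 8)` would return the sorted indices of eight representative frames, and `temporal_attention_pool` would then combine their features into one clip-level vector, giving higher-scored frames more influence.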
Pages: 11611 - 11626
Page count: 16
Related papers
50 in total
  • [31] The Very Deep Multi-stage Two-stream Convolutional Neural Network for Action Recognition
    Gao, Xiuju
    Zhang, Hanling
    PROCEEDINGS OF THE 2016 3RD INTERNATIONAL CONFERENCE ON MECHATRONICS AND INFORMATION TECHNOLOGY (ICMIT), 2016, 49 : 265 - 269
  • [32] Improving human action recognition with two-stream 3D convolutional neural network
    Van-Minh Khong
    Thanh-Hai Tran
    2018 1ST INTERNATIONAL CONFERENCE ON MULTIMEDIA ANALYSIS AND PATTERN RECOGNITION (MAPR), 2018,
  • [33] Human Action Recognition based on Two-Stream Ind Recurrent Neural Network
    Ge Penghua
    Zhi Min
    TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069
  • [34] Deep Convolutional Neural Network Based on Two-Stream Convolutional Unit
    Hou Congcong
    He Yuqing
    Jiang Xiaoheng
    Pan Jing
    LASER & OPTOELECTRONICS PROGRESS, 2018, 55 (02)
  • [35] Lightweight Two-Stream Convolutional Neural Network for SAR Target Recognition
    Huang, Xiayuan
    Yang, Qiao
    Qiao, Hong
    IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2021, 18 (04) : 667 - 671
  • [36] Human action recognition based on quaternion spatial-temporal convolutional neural network and LSTM in RGB videos
    Meng, Bo
    Liu, XueJun
    Wang, Xiaolin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2018, 77 (20) : 26901 - 26918
  • [37] Multi-scale spatial-temporal convolutional neural network for skeleton-based action recognition
    Cheng, Qin
    Cheng, Jun
    Ren, Ziliang
    Zhang, Qieshi
    Liu, Jianming
    PATTERN ANALYSIS AND APPLICATIONS, 2023, 26 (03) : 1303 - 1315
  • [38] Human action recognition based on quaternion spatial-temporal convolutional neural network and LSTM in RGB videos
    Bo Meng
    XueJun Liu
    Xiaolin Wang
    Multimedia Tools and Applications, 2018, 77 : 26901 - 26918
  • [39] Two-Stream Convolution Neural Network with Video-stream for Action Recognition
    Dai, Wei
    Chen, Yimin
    Huang, Chen
    Gao, Ming-Ke
    Zhang, Xinyu
    2019 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2019,
  • [40] Marine ship target recognition using two-stream symmetric feature fusion convolutional neural network
    Sun, Yi-Yun
    Fan, Zhen
    Dong, Shan-Ling
    Zheng, Rong-Hao
    Lan, Jian
    Kongzhi Lilun Yu Yingyong/Control Theory and Applications, 2022, 39 (11) : 2009 - 2018