Efficient spatio-temporal network for action recognition

Cited by: 0
Authors
Su, Yanxiong [1]
Zhao, Qian [1]
Affiliations
[1] Shanghai Univ Elect Power, Coll Elect & Informat Engn, Shanghai 201306, Peoples R China
Keywords
Spatio-temporal feature; Motion feature learning; Video action recognition; Channel feature
DOI
10.1007/s11554-024-01541-6
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The input tensor of video data has temporal, spatial, and channel dimensions, all of which matter for extracting the complementary spatial, temporal, and spatio-temporal features used in video action recognition. To extract and integrate these features efficiently, we propose an efficient spatio-temporal module (ESTM) with three pathways dedicated to spatial, temporal, and spatio-temporal features, respectively. Each pathway uses a Cross Global Average Pooling (CGAP) module to compress the dimension assigned to that pathway, so that its features concentrate on the remaining two dimensions. This improves feature extraction and the recognition rate for complex actions. We also introduce a Motion Excitation Module (MEM) that enriches the input features by transforming the correlations between adjacent frames while reducing computational complexity. Finally, ESTM and MEM are integrated into a 2D CNN to form the efficient spatio-temporal network (ESTN), with minimal impact on parameter count and computational cost. Extensive experiments show that ESTN outperforms state-of-the-art methods on datasets such as Something-Something V1 & V2 and HMDB51, validating its effectiveness.
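The abstract does not define CGAP or MEM, so the following is only a rough sketch of the two ideas it names: averaging away one dimension of a video tensor so the remaining two stand out, and deriving a motion signal from adjacent frames. The tensor layout (T, C, H, W), the function names `pool_one_axis` and `adjacent_frame_difference`, the use of a plain frame difference, and the mapping of pooled axes to pathways are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def pool_one_axis(x, axis):
    """Average over the given axis (or axes), keeping it as size 1."""
    return x.mean(axis=axis, keepdims=True)


def adjacent_frame_difference(x):
    """Difference between consecutive frames along axis 0 (time).

    The last time step is zero-padded so the output keeps T frames.
    This is a simple stand-in for a motion cue, not the paper's MEM.
    """
    diff = x[1:] - x[:-1]
    pad = np.zeros_like(x[:1])
    return np.concatenate([diff, pad], axis=0)


# Hypothetical clip: T=8 frames, C=16 channels, 14x14 spatial grid.
rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 16, 14, 14))

# Assumed reading of the three pathways:
# pool time    -> channel + spatial information remains
# pool space   -> channel + temporal information remains
# pool channel -> spatial + temporal information remains
time_pooled = pool_one_axis(clip, axis=0)
space_pooled = pool_one_axis(clip, axis=(2, 3))
channel_pooled = pool_one_axis(clip, axis=1)

motion = adjacent_frame_difference(clip)

print(time_pooled.shape, space_pooled.shape, channel_pooled.shape, motion.shape)
```

Keeping the pooled axis at size 1 lets each pooled tensor broadcast back against the original clip, which is one plausible way such pathway outputs could be recombined with the input features.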
Pages: 13