MEST: An Action Recognition Network with Motion Encoder and Spatio-Temporal Module

被引:0
|
作者
Zhang, Yi [1 ]
机构
[1] Sichuan Univ, Dept Comp Sci, Chengdu 610017, Peoples R China
关键词
action recognition; temporal modeling; key frame; spatio-temporal information;
D O I
10.3390/s22176595
中图分类号
O65 [分析化学];
学科分类号
070302 ; 081704 ;
摘要
As a sub-field of video content analysis, action recognition has received extensive attention in recent years, which aims to recognize human actions in videos. Compared with a single image, video has a temporal dimension. Therefore, it is of great significance to extract the spatio-temporal information from videos for action recognition. In this paper, an efficient network to extract spatio-temporal information with relatively low computational load (dubbed MEST) is proposed. Firstly, a motion encoder to capture short-term motion cues between consecutive frames is developed, followed by a channel-wise spatio-temporal module to model long-term feature information. Moreover, the weight standardization method is applied to the convolution layers followed by batch normalization layers to expedite the training process and facilitate convergence. Experiments are conducted on five public datasets of action recognition, Something-Something-V1 and -V2, Jester, UCF101 and HMDB51, where MEST exhibits competitive performance compared to other popular methods. The results demonstrate the effectiveness of our network in terms of accuracy, computational cost and network scales.
引用
收藏
页数:17
相关论文
共 50 条
  • [1] A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention
    Yang, Qi
    Lu, Tongwei
    Zhou, Huabing
    [J]. ENTROPY, 2022, 24 (03)
  • [2] Spatio-Temporal Collaborative Module for Efficient Action Recognition
    Hao, Yanbin
    Wang, Shuo
    Tan, Yi
    He, Xiangnan
    Liu, Zhenguang
    Wang, Meng
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 7279 - 7291
  • [3] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [4] ACTION RECOGNITION USING SPATIO-TEMPORAL DIFFERENTIAL MOTION
    Yadav, Gaurav Kumar
    Sethi, Amit
    [J]. 2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 3415 - 3419
  • [5] STSM: Spatio-Temporal Shift Module for Efficient Action Recognition
    Yang, Zhaoqilin
    An, Gaoyun
    Zhang, Ruichen
    [J]. MATHEMATICS, 2022, 10 (18)
  • [6] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Lu, Xuemin
    Quan, Wei
    Marek, Reformat
    Zhao, Haiquan
    Chen, Jim X. X.
    [J]. VISUAL COMPUTER, 2024, 40 (05): : 3163 - 3181
  • [7] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Xuemin Lu
    Wei Quan
    Reformat Marek
    Haiquan Zhao
    Jim X. Chen
    [J]. The Visual Computer, 2024, 40 : 3163 - 3181
  • [8] Video Action Recognition Based on Spatio-temporal Feature Pyramid Module
    Gong, Suming
    Chen, Ying
    [J]. 2020 13TH INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL INTELLIGENCE AND DESIGN (ISCID 2020), 2020, : 338 - 341
  • [9] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [10] SPATIO-TEMPORAL MOTION AGGREGATION NETWORK FOR VIDEO ACTION DETECTION
    Zhang, Hongcheng
    Zhao, Xu
    [J]. 2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 2180 - 2184