ESTI: an action recognition network with enhanced spatio-temporal information

Citations: 0
Authors
ZhiYu Jiang
Yi Zhang
Shu Hu
Affiliations
[1] College of Computer Science, Sichuan University
Keywords
Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information
DOI
Not available
Abstract
Action recognition is an active topic in video understanding that aims to recognize human actions in videos. The critical step is to model spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) that consists of two core modules. The Local Motion Extraction module highlights the short-term temporal context, while the Global Multi-scale Feature Enhancement module strengthens the spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning about different kinds of actions with various amplitudes in videos. Our network is developed on two GeForce RTX 3090 GPUs using Python 3.7 and PyTorch 1.8. Extensive experiments have been conducted on five mainstream datasets to verify the effectiveness of our network, in which ESTI outperforms most state-of-the-art methods in terms of accuracy, computational cost and network scale. In addition, we visualize the feature representation of our model using Grad-CAM to validate its accuracy.
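The abstract does not say how the Local Motion Extraction module is built. As a rough illustration of what "highlighting short-term temporal context" can mean, the sketch below uses a common stand-in technique, adjacent-frame feature differencing, to re-weight per-frame feature maps. The function name `temporal_difference_gate`, the `(T, C, H, W)` layout, and the tanh gating are all assumptions made for this example. It is not the ESTI module.

```python
import numpy as np

def temporal_difference_gate(feats):
    """Re-weight per-frame features by short-term motion (illustrative only).

    feats: array of shape (T, C, H, W), one feature map per frame.
    Hypothetical helper, NOT the ESTI Local Motion Extraction module.
    """
    feats = np.asarray(feats, dtype=float)
    # Difference between each frame and its successor; the last frame has
    # no successor, so its difference stays zero.
    diff = np.zeros_like(feats)
    diff[:-1] = feats[1:] - feats[:-1]
    # Average absolute change over space: one motion score per frame and channel.
    score = np.abs(diff).mean(axis=(2, 3), keepdims=True)  # shape (T, C, 1, 1)
    # tanh maps the non-negative score into [0, 1); a static clip gives 0.
    gate = np.tanh(score)
    # Residual re-weighting: features are scaled up where motion is larger
    # and left unchanged where nothing moves.
    return feats * (1.0 + gate)

rng = np.random.default_rng(0)
clip = rng.standard_normal((8, 16, 7, 7))  # 8 frames, 16 channels, 7x7 maps
out = temporal_difference_gate(clip)
print(out.shape)  # (8, 16, 7, 7)
```

Under these assumptions a clip with no change between frames passes through unchanged, which makes the gate easy to sanity-check before placing it inside a 2D backbone.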
Pages: 3059-3070
Page count: 11
Related papers
50 in total
  • [1] ESTI: an action recognition network with enhanced spatio-temporal information
    Jiang, ZhiYu
    Zhang, Yi
    Hu, Shu
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (09) : 3059 - 3070
  • [2] Spatio-temporal information for human action recognition
    Yao, Li
    Liu, Yunjian
    Huang, Shihui
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2016
  • [3] Spatio-temporal information for human action recognition
    Li Yao
    Yunjian Liu
    Shihui Huang
    [J]. EURASIP Journal on Image and Video Processing, 2016
  • [4] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [5] Skeleton Action Recognition Based on Spatio-temporal Feature Enhanced Graph Convolutional Network
    Cao, Yi
    Wu, Weiguan
    Li, Ping
    Xia, Yu
    Gao, Qingyuan
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (08) : 3022 - 3031
  • [6] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [7] Key Spatio-Temporal Energy Information Mapping for Action Recognition
    Chao, Xin
    Hou, Zhenjie
    Kong, Fei
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (19) : 22895 - 22904
  • [8] Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
    Zhang, Man
    Li, Xing
    Wu, Qianhan
    [J]. SYMMETRY-BASEL, 2023, 15 (12)
  • [9] A Spatio-Temporal Convolutional Neural Network for Skeletal Action Recognition
    Hu, Lizhang
    Xu, Jinhua
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 377 - 385
  • [10] Action recognition based on spatio-temporal information and nonnegative component representation
    Wang, Jianhong
    Zhang, Xu
    Zhang, Pinzheng
    Jiang, Longyu
    Luo, Limin
    [J]. Dongnan Daxue Xuebao (Ziran Kexue Ban)/Journal of Southeast University (Natural Science Edition), 2016, 46 (04) : 675 - 680