ESTI: an action recognition network with enhanced spatio-temporal information

Authors
Jiang, ZhiYu [1]
Zhang, Yi [1]
Hu, Shu [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610000, Peoples R China
Keywords
Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information;
DOI
10.1007/s13042-023-01820-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Action recognition is an active topic in video understanding that aims to recognize human actions in videos. The critical step is to model spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) that consists of two core modules. The Local Motion Extraction module highlights short-term temporal context, while the Global Multi-scale Feature Enhancement module strengthens spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning about different kinds of actions with various amplitudes in videos. Our network was developed on two GeForce RTX 3090 GPUs using Python 3.7 and PyTorch 1.8. Extensive experiments on five mainstream datasets verify the effectiveness of our network: ESTI outperforms most state-of-the-art methods in terms of accuracy, computational cost and network scale. In addition, we visualize the feature representation of our model using Grad-CAM to validate its accuracy.
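The abstract does not say how short-term temporal context is highlighted. One common way to expose short-term motion is to difference the features of adjacent frames; the sketch below illustrates only that general idea and is not the paper's Local Motion Extraction module. The function names and the flat per-frame feature layout are assumptions made for illustration, and plain Python lists stand in for tensors.

```python
from typing import List

# Assumption: each frame is a flattened feature vector (hypothetical layout).
Frame = List[float]


def adjacent_frame_differences(frames: List[Frame]) -> List[Frame]:
    """Return element-wise differences between each pair of consecutive frames."""
    if len(frames) < 2:
        return []
    width = len(frames[0])
    if any(len(frame) != width for frame in frames):
        raise ValueError("all frames must have the same number of features")
    return [
        [nxt - cur for cur, nxt in zip(f0, f1)]
        for f0, f1 in zip(frames, frames[1:])
    ]


def motion_energy(frames: List[Frame]) -> List[float]:
    """Mean absolute adjacent-frame difference per time step.

    Larger values indicate more short-term change between two frames.
    """
    diffs = adjacent_frame_differences(frames)
    return [
        (sum(abs(d) for d in diff) / len(diff)) if diff else 0.0
        for diff in diffs
    ]
```

For a clip of T frames this yields T - 1 difference vectors, so a static segment produces zeros while a fast movement produces large values. A real model would apply the same differencing to convolutional feature maps rather than to flat lists.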
Pages: 3059-3070
Number of pages: 12
Related papers
50 in total
  • [1] ESTI: an action recognition network with enhanced spatio-temporal information
    ZhiYu Jiang
    Yi Zhang
    Shu Hu
    [J]. International Journal of Machine Learning and Cybernetics, 2023, 14 : 3059 - 3070
  • [2] Spatio-temporal information for human action recognition
    Yao, Li
    Liu, Yunjian
    Huang, Shihui
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2016
  • [3] Spatio-temporal information for human action recognition
    Li Yao
    Yunjian Liu
    Shihui Huang
    [J]. EURASIP Journal on Image and Video Processing, 2016
  • [4] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    [J]. JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [5] Skeleton Action Recognition Based on Spatio-temporal Feature Enhanced Graph Convolutional Network
    Cao, Yi
    Wu, Weiguan
    Li, Ping
    Xia, Yu
    Gao, Qingyuan
    [J]. JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (08) : 3022 - 3031
  • [6] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [7] Key Spatio-Temporal Energy Information Mapping for Action Recognition
    Chao, Xin
    Hou, Zhenjie
    Kong, Fei
    [J]. IEEE SENSORS JOURNAL, 2023, 23 (19) : 22895 - 22904
  • [8] Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
    Zhang, Man
    Li, Xing
    Wu, Qianhan
    [J]. SYMMETRY-BASEL, 2023, 15 (12)
  • [9] A Spatio-Temporal Convolutional Neural Network for Skeletal Action Recognition
    Hu, Lizhang
    Xu, Jinhua
    [J]. NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 377 - 385
  • [10] A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention
    Yang, Qi
    Lu, Tongwei
    Zhou, Huabing
    [J]. ENTROPY, 2022, 24 (03)