ESTI: an action recognition network with enhanced spatio-temporal information

被引:1
|
作者
Jiang, ZhiYu [1 ]
Zhang, Yi [1 ]
Hu, Shu [1 ]
机构
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610000, Peoples R China
关键词
Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information;
D O I
10.1007/s13042-023-01820-x
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Action recognition is an active topic in video understanding, which aims to recognize human actions in videos. The critical step is to model the spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) which consists of two core modules. The Local Motion Extraction module highlights the short-term temporal context. While the Global Multi-scale Feature Enhancement module strengthens the spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning different kinds of actions with various amplitudes in videos. Our network is developed under two Geforce RTX 3090 using Python3.7/Pytorch1.8. Extensive experiments have been conducted on 5 mainstream datasets to verify the effectiveness of our network, in which ESTI outperforms most of the state-of-the-arts methods in terms of accuracy, computational cost and network scale. Besides, we also visualize the feature representation of our model by using Grad-Cam to validate its accuracy.
引用
收藏
页码:3059 / 3070
页数:12
相关论文
共 50 条
  • [1] ESTI: an action recognition network with enhanced spatio-temporal information
    ZhiYu Jiang
    Yi Zhang
    Shu Hu
    International Journal of Machine Learning and Cybernetics, 2023, 14 : 3059 - 3070
  • [2] Spatio-temporal information for human action recognition
    Yao, Li
    Liu, Yunjian
    Huang, Shihui
    EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2016,
  • [3] Spatio-temporal information for human action recognition
    Li Yao
    Yunjian Liu
    Shihui Huang
    EURASIP Journal on Image and Video Processing, 2016
  • [4] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [5] Skeleton Action Recognition Based on Spatio-temporal Feature Enhanced Graph Convolutional Network
    Cao, Yi
    Wu, Weiguan
    Li, Ping
    Xia, Yu
    Gao, Qingyuan
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2023, 45 (08) : 3022 - 3031
  • [6] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [7] Key Spatio-Temporal Energy Information Mapping for Action Recognition
    Chao, Xin
    Hou, Zhenjie
    Kong, Fei
    IEEE SENSORS JOURNAL, 2023, 23 (19) : 22895 - 22904
  • [8] Spatio-Temporal Information Fusion and Filtration for Human Action Recognition
    Zhang, Man
    Li, Xing
    Wu, Qianhan
    SYMMETRY-BASEL, 2023, 15 (12):
  • [9] A Spatio-Temporal Convolutional Neural Network for Skeletal Action Recognition
    Hu, Lizhang
    Xu, Jinhua
    NEURAL INFORMATION PROCESSING (ICONIP 2017), PT III, 2017, 10636 : 377 - 385
  • [10] STCA: an action recognition network with spatio-temporal convolution and attention
    Qiuhong Tian
    Weilun Miao
    Lizao Zhang
    Ziyu Yang
    Yang Yu
    Yanying Zhao
    Lan Yao
    International Journal of Multimedia Information Retrieval, 2025, 14 (1)