ESTI: an action recognition network with enhanced spatio-temporal information

Cited by: 1
Authors
Jiang, ZhiYu [1 ]
Zhang, Yi [1 ]
Hu, Shu [1 ]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610000, Peoples R China
Keywords
Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information;
DOI
10.1007/s13042-023-01820-x
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Action recognition is an active topic in video understanding that aims to recognize human actions in videos. The critical step is to model spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) that consists of two core modules. The Local Motion Extraction module highlights the short-term temporal context, while the Global Multi-scale Feature Enhancement module strengthens the spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning about different kinds of actions with various amplitudes in videos. Our network is developed on two GeForce RTX 3090 GPUs using Python 3.7 and PyTorch 1.8. Extensive experiments have been conducted on 5 mainstream datasets to verify the effectiveness of our network, in which ESTI outperforms most state-of-the-art methods in terms of accuracy, computational cost and network scale. In addition, we visualize the feature representation of our model using Grad-CAM to validate its accuracy.
Pages: 3059 - 3070
Number of pages: 12
Related papers
50 in total
  • [41] Spatio-Temporal Attention Networks for Action Recognition and Detection
    Li, Jun
    Liu, Xianglong
    Zhang, Wenxuan
    Zhang, Mingyuan
    Song, Jingkuan
    Sebe, Nicu
    IEEE TRANSACTIONS ON MULTIMEDIA, 2020, 22 (11) : 2990 - 3001
  • [42] Human Action Recognition Using Spatio-temporal Classification
    Fang, Chin-Hsien
    Chen, Ju-Chin
    Tseng, Chien-Chung
    Lien, Jenn-Jier James
    COMPUTER VISION - ACCV 2009, PT II, 2010, 5995 : 98 - 109
  • [43] Human Action Recognition Based on Spatio-temporal Features
    Sawant, Nikhil
    Biswas, K. K.
    PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PROCEEDINGS, 2009, 5909 : 357 - 362
  • [44] Hierarchical Spatio-Temporal Context Modeling for Action Recognition
    Sun, Ju
    Wu, Xiao
    Yan, Shuicheng
    Cheong, Loong-Fah
    Chua, Tat-Seng
    Li, Jintao
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 2004 - +
  • [45] Spatio-Temporal Information for Action Recognition in Thermal Video Using Deep Learning Model
    Srihari, P.
    Harikiran, J.
    INTERNATIONAL JOURNAL OF ELECTRICAL AND COMPUTER ENGINEERING SYSTEMS, 2022, 13 (08) : 669 - 680
  • [46] STGFP: information enhanced spatio-temporal graph neural network for traffic flow prediction
    Qi Li
    Fan Wang
    Chen Wang
    Applied Intelligence, 2025, 55 (6)
  • [47] STHARNet: spatio-temporal human action recognition network in content based video retrieval
    Sowmyayani, S.
    Rani, P. Arockia Jansi
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 82 (24) : 38051 - 38066
  • [48] STHARNet: spatio-temporal human action recognition network in content based video retrieval
    S. Sowmyayani
    P. Arockia Jansi Rani
    Multimedia Tools and Applications, 2023, 82 : 38051 - 38066
  • [49] DSTC-Net: differential spatio-temporal correlation network for similar action recognition
    Chen, Hongwei
    He, Shiqi
    Chen, Zexi
    MULTIMEDIA SYSTEMS, 2024, 30 (03)
  • [50] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Lu, Xuemin
    Quan, Wei
    Marek, Reformat
    Zhao, Haiquan
    Chen, Jim X. X.
    VISUAL COMPUTER, 2024, 40 (05) : 3163 - 3181