ESTI: an action recognition network with enhanced spatio-temporal information

Cited by: 1
Authors
Jiang, ZhiYu [1]
Zhang, Yi [1]
Hu, Shu [1]
Affiliations
[1] Sichuan Univ, Coll Comp Sci, Chengdu 610000, Peoples R China
Keywords
Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information
DOI
10.1007/s13042-023-01820-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Action recognition, which aims to recognize human actions in videos, is an active topic in video understanding. The critical step is to model the spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) that consists of two core modules. The Local Motion Extraction module highlights the short-term temporal context, while the Global Multi-scale Feature Enhancement module strengthens the spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning about different kinds of actions with various amplitudes in videos. Our network is developed on two GeForce RTX 3090 GPUs using Python 3.7 and PyTorch 1.8. Extensive experiments have been conducted on five mainstream datasets to verify the effectiveness of our network, in which ESTI outperforms most state-of-the-art methods in terms of accuracy, computational cost, and network scale. In addition, we visualize the feature representation of our model using Grad-CAM to validate its accuracy.
Pages: 3059-3070
Page count: 12