ESTI: an action recognition network with enhanced spatio-temporal information

Cited by: 1

Authors:

Jiang, ZhiYu [1]

Zhang, Yi [1]

Hu, Shu [1]

Affiliation:

[1] Sichuan Univ, Coll Comp Sci, Chengdu 610000, Peoples R China

Source:

INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS | 2023 / Vol. 14 / No. 09

Keywords:

Action recognition; Feature enhancement; Global multi-scale feature; Local motion extraction; Spatio-temporal information;

DOI:

10.1007/s13042-023-01820-x

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification code:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Action recognition, which aims to recognize human actions in videos, is an active topic in video understanding. The critical step is to model spatio-temporal information and extract key action clues. To this end, we propose a simple and efficient network (dubbed ESTI) that consists of two core modules. The Local Motion Extraction module highlights the short-term temporal context, while the Global Multi-scale Feature Enhancement module strengthens the spatio-temporal and channel features to model long-term information. By appending ESTI to a 2D ResNet backbone, our network is capable of reasoning about different kinds of actions with various amplitudes in videos. Our network was developed on two GeForce RTX 3090 GPUs using Python 3.7 and PyTorch 1.8. Extensive experiments were conducted on 5 mainstream datasets to verify the effectiveness of our network, in which ESTI outperforms most state-of-the-art methods in terms of accuracy, computational cost and network scale. We also visualize the feature representation of our model using Grad-CAM to validate its accuracy.

Citation

Pages: 3059 - 3070

Page count: 12

50 items in total

[31] Projection transform on spatio-temporal context for action recognition
Xu, Wanru
Miao, Zhenjiang
Zhang, Qiang
MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (18) : 7711 - 7728
[32] SKELETON ACTION RECOGNITION BASED ON SPATIO-TEMPORAL FEATURES
Huang, Qian
Xie, Mengting
Li, Xing
Wang, Shuaichen
2023 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2023, : 3284 - 3288
[33] Exploiting spatio-temporal knowledge for video action recognition
Zhang, Huigang
Wang, Liuan
Sun, Jun
IET COMPUTER VISION, 2023, 17 (02) : 222 - 230
[34] Spatio-temporal Semantic Features for Human Action Recognition
Liu, Jia
Wang, Xiaonian
Li, Tianyu
Yang, Jie
KSII TRANSACTIONS ON INTERNET AND INFORMATION SYSTEMS, 2012, 6 (10) : 2632 - 2649
[35] ACTION RECOGNITION USING SPATIO-TEMPORAL DIFFERENTIAL MOTION
Yadav, Gaurav Kumar
Sethi, Amit
2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 3415 - 3419
[36] Interpretable Spatio-temporal Attention for Video Action Recognition
Meng, Lili
Zhao, Bo
Chang, Bo
Huang, Gao
Sun, Wei
Tung, Frederich
Sigal, Leonid
2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019, : 1513 - 1522
[37] Spatio-temporal shape and flow correlation for action recognition
Ke, Yan
Sukthankar, Rahul
Hebert, Martial
2007 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-8, 2007, : 3835 - +
[38] Spatio-Temporal Laplacian Pyramid Coding for Action Recognition
Shao, Ling
Zhen, Xiantong
Tao, Dacheng
Li, Xuelong
IEEE TRANSACTIONS ON CYBERNETICS, 2014, 44 (06) : 817 - 827
[39] Spatio-Temporal Collaborative Module for Efficient Action Recognition
Hao, Yanbin
Wang, Shuo
Tan, Yi
He, Xiangnan
Liu, Zhenguang
Wang, Meng
IEEE TRANSACTIONS ON IMAGE PROCESSING, 2022, 31 : 7279 - 7291
[40] Spatio-temporal Contrastive Domain Adaptation for Action Recognition
Song, Xiaolin
Zhao, Sicheng
Yang, Jingyu
Yue, Huanjing
Xu, Pengfei
Hu, Runbo
Chai, Hua
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 9782 - 9790
