A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention

Cited by: 8

Authors:

Yang, Qi [1,2]

Lu, Tongwei [1,2]

Zhou, Huabing [1,2]

Affiliations:

[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Wuhan 430205, Peoples R China

[2] Wuhan Inst Technol, Hubei Key Lab Intelligent Robot, Wuhan 430205, Peoples R China

Source:

ENTROPY | 2022, Vol. 24, No. 3

Funding:

National Natural Science Foundation of China;

Keywords:

temporal modeling; spatio-temporal motion; group convolution; spatial attention;

DOI:

10.3390/e24030368

Chinese Library Classification (CLC) number:

O4 [Physics];

Subject classification code:

0702;

Abstract:

Temporal modeling is key to action recognition in videos, but traditional 2D CNNs do not capture temporal relationships well. 3D CNNs can achieve good performance, but they are computationally intensive and impractical to deploy on existing devices. To address these problems, we design a generic and effective module called the spatio-temporal motion network (SMNet). SMNet maintains the complexity of 2D CNNs and reduces the computational cost of the algorithm while achieving performance comparable to 3D CNNs. SMNet contains a spatio-temporal excitation (SE) module and a motion excitation (ME) module. The SE module uses group convolution to fuse temporal information, which reduces the number of parameters in the network, and uses spatial attention to extract spatial information. The ME module uses the difference between adjacent frames to extract feature-level motion patterns, which effectively encodes motion features and helps identify actions efficiently. We use ResNet-50 as the backbone network and insert SMNet into the residual blocks to form a simple and effective action network. Experimental results on three datasets, namely Something-Something V1, Something-Something V2, and Kinetics-400, show that it outperforms state-of-the-art motion recognition networks.
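
Two ideas named in the abstract can be illustrated with a small sketch: group convolution shrinks the weight count by the number of groups, and motion cues can be taken from differences between adjacent per-frame features. The code below is not the authors' implementation. The function names, the 1D temporal kernel, and the plain subtraction (with no learned transform before the difference) are assumptions made for illustration, since the abstract does not give those details.

```python
def conv_params(c_in, c_out, kernel_size, groups=1):
    """Weight count of a 1D convolution, ignoring bias (hypothetical helper).

    With `groups` groups, each output channel sees only c_in / groups
    input channels, so the weight count drops by a factor of `groups`.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * kernel_size


def adjacent_frame_differences(features):
    """Return features[t + 1] - features[t] for each adjacent frame pair.

    `features` is a list of T per-frame feature vectors (lists of equal
    length); the result has T - 1 difference vectors. This is plain
    subtraction, an assumption standing in for the ME module's details.
    """
    return [
        [nxt_v - prev_v for prev_v, nxt_v in zip(prev, nxt)]
        for prev, nxt in zip(features, features[1:])
    ]


if __name__ == "__main__":
    # Standard vs. grouped temporal convolution over 64 channels, kernel size 3.
    print(conv_params(64, 64, 3))            # 64 * 64 * 3 = 12288
    print(conv_params(64, 64, 3, groups=8))  # 8 * 64 * 3 = 1536

    # Three frames with two feature channels each.
    frames = [[1, 2], [3, 5], [3, 4]]
    print(adjacent_frame_differences(frames))  # [[2, 3], [0, -1]]
```

In this toy setting the grouped convolution uses one eighth of the weights of the standard one, which is the kind of parameter saving the abstract attributes to the SE module.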

Pages: 19
