A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention

Cited: 8
Authors
Yang, Qi [1 ,2 ]
Lu, Tongwei [1 ,2 ]
Zhou, Huabing [1 ,2 ]
Affiliations
[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Wuhan 430205, Peoples R China
[2] Wuhan Inst Technol, Hubei Key Lab Intelligent Robot, Wuhan 430205, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
temporal modeling; spatio-temporal motion; group convolution; spatial attention;
DOI
10.3390/e24030368
Chinese Library Classification
O4 [Physics]
Discipline code
0702
Abstract
Temporal modeling is key to action recognition in videos, but traditional 2D CNNs do not capture temporal relationships well. 3D CNNs achieve good performance but are computationally intensive and difficult to deploy on existing devices. To address these problems, we design a generic and effective module called the spatio-temporal motion network (SMNet). SMNet keeps the complexity of a 2D network, reducing the computational cost of the algorithm while achieving performance comparable to 3D CNNs. SMNet contains a spatio-temporal excitation (SE) module and a motion excitation (ME) module. The SE module uses group convolution to fuse temporal information, reducing the number of parameters in the network, and applies spatial attention to extract spatial information. The ME module uses differences between adjacent frames to extract feature-level motion patterns, which effectively encode motion features and help identify actions efficiently. We use ResNet-50 as the backbone network and insert SMNet into its residual blocks to form a simple and effective action network. Experimental results on three datasets, Something-Something V1, Something-Something V2, and Kinetics-400, show that it outperforms state-of-the-art action recognition networks.
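The core idea behind the ME module, as the abstract describes it, is to take differences between adjacent frames' features and use them to emphasize motion-related responses. The following is a minimal, dependency-free sketch of that frame-difference-and-excite pattern; the paper's actual module operates on convolutional feature maps with channel reduction, and the sigmoid residual gating used here is an assumption for illustration, not the authors' exact formulation.

```python
import math

def sigmoid(x):
    """Standard logistic function, used here as the excitation gate."""
    return 1.0 / (1.0 + math.exp(-x))

def motion_excitation(frames):
    """Sketch of feature-level motion excitation.

    `frames` is a list of T per-frame feature vectors (lists of floats).
    Motion at time t is approximated by the difference between the
    features of frame t+1 and frame t; the last frame is zero-padded.
    Each channel is then re-weighted so that channels with stronger
    motion responses are amplified (residual-style excitation).
    """
    T = len(frames)
    motions = []
    for t in range(T - 1):
        diff = [b - a for a, b in zip(frames[t], frames[t + 1])]
        motions.append(diff)
    motions.append([0.0] * len(frames[0]))  # no successor for the last frame

    gated = []
    for feat, mot in zip(frames, motions):
        weights = [sigmoid(m) for m in mot]          # motion -> attention weight
        gated.append([f * (1.0 + w) for f, w in zip(feat, weights)])
    return gated
```

Static channels (zero difference) still pass through with a weight of 1.5 here; in practice a learned transform would map the difference to the gate, but the adjacent-frame subtraction is the essential mechanism the abstract highlights.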
Pages: 19
Related Papers
50 records
  • [11] On the spatial attention in spatio-temporal graph convolutional networks for skeleton-based human action recognition
    Heidari, Negar
    Iosifidis, Alexandros
    [J]. 2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2021,
  • [12] Robust human action recognition based on spatio-temporal descriptors and motion temporal templates
    Dou, Jianfang
    Li, Jianxun
    [J]. OPTIK, 2014, 125 (07): : 1891 - 1896
  • [13] A fast human action recognition network based on spatio-temporal features
    Xu, Jie
    Song, Rui
    Wei, Haoliang
    Guo, Jinhong
    Zhou, Yifei
    Huang, Xiwei
    [J]. NEUROCOMPUTING, 2021, 441 : 350 - 358
  • [14] Human Action Recognition Algorithm Based on Spatio-Temporal Interactive Attention Model
    Pan Na
    Jiang Min
    Kong Jun
    [J]. LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (18)
  • [15] Spatio-Temporal Self-Attention Weighted VLAD Neural Network for Action Recognition
    Cheng, Shilei
    Xie, Mei
    Ma, Zheng
    Li, Siqi
    Gu, Song
    Yang, Feng
    [J]. IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2021, E104D (01) : 220 - 224
  • [16] Unified Spatio-Temporal Attention Networks for Action Recognition in Videos
    Li, Dong
    Yao, Ting
    Duan, Ling-Yu
    Mei, Tao
    Rui, Yong
    [J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2019, 21 (02) : 416 - 428
  • [17] SiamMAST: Siamese motion-aware spatio-temporal network for video action recognition
    Lu, Xuemin
    Quan, Wei
    Marek, Reformat
    Zhao, Haiquan
    Chen, Jim X. X.
    [J]. VISUAL COMPUTER, 2024, 40 (05): : 3163 - 3181
  • [19] Spatio-Temporal Adaptive Network With Bidirectional Temporal Difference for Action Recognition
    Li, Zhilei
    Li, Jun
    Ma, Yuqing
    Wang, Rui
    Shi, Zhiping
    Ding, Yifu
    Liu, Xianglong
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (09) : 5174 - 5185
  • [20] Human action recognition in immersive virtual reality based on multi-scale spatio-temporal attention network
    Xiao, Zhiyong
    Chen, Yukun
    Zhou, Xinlei
    He, Mingwei
    Liu, Li
    Yu, Feng
    Jiang, Minghua
    [J]. COMPUTER ANIMATION AND VIRTUAL WORLDS, 2024, 35 (05)