A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention

Cited: 8
Authors
Yang, Qi [1,2]
Lu, Tongwei [1,2]
Zhou, Huabing [1,2]
Affiliations
[1] Wuhan Inst Technol, Sch Comp Sci & Engn, Wuhan 430205, Peoples R China
[2] Wuhan Inst Technol, Hubei Key Lab Intelligent Robot, Wuhan 430205, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
temporal modeling; spatio-temporal motion; group convolution; spatial attention;
DOI
10.3390/e24030368
Chinese Library Classification
O4 [Physics]
Discipline Code
0702
Abstract
Temporal modeling is the key to action recognition in videos, but traditional 2D CNNs do not capture temporal relationships well. 3D CNNs can achieve good performance, but they are computationally intensive and difficult to deploy on existing devices. To address these problems, we design a generic and effective module called the spatio-temporal motion network (SMNet). SMNet keeps the complexity of 2D CNNs and reduces the computational cost of the algorithm while achieving performance comparable to 3D CNNs. SMNet contains a spatio-temporal excitation module (SE) and a motion excitation module (ME). The SE module uses group convolution to fuse temporal information, which reduces the number of parameters in the network, and uses spatial attention to extract spatial information. The ME module uses the difference between adjacent frames to extract feature-level motion patterns, which effectively encodes motion features and helps identify actions efficiently. We use ResNet-50 as the backbone network and insert SMNet into its residual blocks to form a simple and effective action recognition network. Experimental results on three datasets, Something-Something V1, Something-Something V2, and Kinetics-400, show that it outperforms state-of-the-art action recognition networks.
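The following is a minimal, hypothetical PyTorch sketch of the two ideas described in the abstract: temporal fusion via channel-wise group convolution combined with spatial attention (SE), and frame-difference motion excitation (ME). Module names, kernel sizes, channel widths, and the segment count are illustrative assumptions, not the authors' implementation.

    # Hypothetical sketch of the SE/ME ideas from the abstract (not the authors' code).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SpatioTemporalExcitation(nn.Module):
        """SE-style block: group convolution over time plus spatial attention."""
        def __init__(self, channels, n_segment=8):
            super().__init__()
            self.n_segment = n_segment
            # Channel-wise (grouped) 1D temporal convolution fuses neighbouring
            # frames with very few parameters.
            self.temporal_conv = nn.Conv1d(channels, channels, kernel_size=3,
                                           padding=1, groups=channels, bias=False)
            # Simple spatial attention: collapse channels into an H x W mask.
            self.spatial_attn = nn.Conv2d(channels, 1, kernel_size=7, padding=3)

        def forward(self, x):
            # x: (N*T, C, H, W) with T = self.n_segment
            nt, c, h, w = x.shape
            n = nt // self.n_segment
            # Fuse temporal information per spatial location with the group conv.
            xt = x.view(n, self.n_segment, c, h, w).permute(0, 3, 4, 2, 1)
            xt = xt.reshape(n * h * w, c, self.n_segment)
            xt = self.temporal_conv(xt)
            xt = xt.reshape(n, h, w, c, self.n_segment).permute(0, 4, 3, 1, 2)
            xt = xt.reshape(nt, c, h, w)
            # Spatial attention mask highlights informative regions.
            mask = torch.sigmoid(self.spatial_attn(xt))
            return x * mask

    class MotionExcitation(nn.Module):
        """ME-style block: feature-level differences between adjacent frames."""
        def __init__(self, channels, n_segment=8):
            super().__init__()
            self.n_segment = n_segment
            self.transform = nn.Conv2d(channels, channels, kernel_size=3,
                                       padding=1, groups=channels, bias=False)

        def forward(self, x):
            nt, c, h, w = x.shape
            n = nt // self.n_segment
            xv = x.view(n, self.n_segment, c, h, w)
            # Difference between frame t+1 (transformed) and frame t encodes motion.
            nxt = self.transform(xv[:, 1:].reshape(-1, c, h, w))
            nxt = nxt.view(n, self.n_segment - 1, c, h, w)
            diff = nxt - xv[:, :-1]
            # Pad the last time step so the output keeps T frames.
            diff = F.pad(diff, (0, 0, 0, 0, 0, 0, 0, 1))
            # Channel-wise excitation from spatially pooled motion features.
            attn = torch.sigmoid(diff.mean(dim=[3, 4], keepdim=True))
            return x + (xv * attn).reshape(nt, c, h, w)

    if __name__ == "__main__":
        feats = torch.randn(2 * 8, 64, 14, 14)   # 2 clips, 8 frames each
        feats = SpatioTemporalExcitation(64)(feats)
        feats = MotionExcitation(64)(feats)
        print(feats.shape)  # torch.Size([16, 64, 14, 14])

In this sketch both blocks keep the (N*T, C, H, W) layout used by 2D-CNN action models and leave the tensor shape unchanged, which is consistent with the abstract's description of inserting SMNet into ResNet-50 residual blocks.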
Pages: 19