Improved SSD using deep multi-scale attention spatial–temporal features for action recognition

Cited by: 0
Authors
Shuren Zhou
Jia Qiu
Arun Solanki
Affiliations
[1] Changsha University of Science and Technology,School of Computer and Communication Engineering
[2] Gautam Buddha University,School of Information and Communication Technology
Source
Multimedia Systems | 2022 / Vol. 28
Keywords
Action recognition; Multi-scale spatial–temporal feature; Attention mechanism;
DOI
Not available
CLC Number
Subject Classification Number
Abstract
The biggest difference between video-based action recognition and image-based action recognition is that the former has the extra dimension of time. Most deep-learning-based action recognition methods either (1) use 3D convolution to model temporal features, or (2) introduce an auxiliary temporal feature such as optical flow. However, 3D convolutional networks usually consume huge computational resources, while optical flow extraction requires a tedious extra process and extra storage space, and usually models only short-range temporal features. To construct temporal features better, in this paper we propose a multi-scale attention spatial–temporal features network based on SSD. The network sparsely samples the video by dividing the whole long-range sequence into segments, and uses a self-attention mechanism to capture the relation between each frame and the sequence of frames sampled over the entire video, making the network notice the representative frames in the sequence. Moreover, an attention mechanism is used to assign different weights to the inter-frame relations representing different time scales, so as to reason about the contextual relations of actions in the time dimension. Our proposed method achieves competitive performance on two commonly used datasets: UCF101 and HMDB51.
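The two ideas the abstract outlines (segment-wise sparse sampling of a long video, then self-attention over the sampled frame features with learned weights across several temporal scales) can be sketched in PyTorch as below. This is a minimal illustrative sketch, not the authors' implementation: the module names, the choice of scales, the pooling used to build coarser time scales, and the use of nn.MultiheadAttention are all assumptions.

```python
# Illustrative sketch only, assuming frame-level features from a 2D backbone
# (e.g. SSD's feature extractor); not the paper's actual architecture.
import torch
import torch.nn as nn


def sparse_sample(num_frames: int, num_segments: int) -> torch.Tensor:
    """Divide the video into equal segments and pick one random frame index per segment."""
    seg_len = num_frames // num_segments
    offsets = torch.arange(num_segments) * seg_len
    return offsets + torch.randint(0, max(seg_len, 1), (num_segments,))


class MultiScaleTemporalAttention(nn.Module):
    """Self-attention of each frame over the whole sampled sequence, repeated at
    several temporal scales (coarsened by average pooling) and fused with
    learned scale weights."""

    def __init__(self, dim: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.scale_logits = nn.Parameter(torch.zeros(len(scales)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_segments, dim) frame-level features
        outputs = []
        for s in self.scales:
            # Coarsen the temporal axis by average pooling with stride s.
            ctx = nn.functional.avg_pool1d(
                x.transpose(1, 2), kernel_size=s, stride=s
            ).transpose(1, 2)
            # Each frame attends to the (coarsened) whole sequence.
            out, _ = self.attn(query=x, key=ctx, value=ctx)
            outputs.append(out)
        # Fuse the temporal scales with softmax-normalized learned weights.
        w = torch.softmax(self.scale_logits, dim=0)
        return sum(wi * oi for wi, oi in zip(w, outputs))


if __name__ == "__main__":
    idx = sparse_sample(num_frames=300, num_segments=8)  # 8 frame indices from a 300-frame clip
    feats = torch.randn(2, 8, 256)                        # placeholder backbone features
    fused = MultiScaleTemporalAttention(dim=256)(feats)
    print(idx.tolist(), fused.shape)                      # -> torch.Size([2, 8, 256])
```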
Pages: 2123-2131
Page count: 8
Related Papers
50 records in total
  • [1] Improved SSD using deep multi-scale attention spatial-temporal features for action recognition
    Zhou, Shuren
    Qiu, Jia
    Solanki, Arun
    [J]. MULTIMEDIA SYSTEMS, 2022, 28 (06) : 2123 - 2131
  • [2] Robust Action Recognition Using Multi-Scale Spatial-Temporal Concatenations of Local Features as Natural Action Structures
    Zhu, Xiaoyuan
    Li, Meng
    Li, Xiaojian
    Yang, Zhiyong
    Tsien, Joe Z.
    [J]. PLOS ONE, 2012, 7 (10):
  • [3] Event Recognition in Unconstrained Video using Multi-Scale Deep Spatial Features
    Umer, Saiyed
    Ghorai, Mrinmoy
    Mohanta, Partha Pratim
    [J]. 2017 NINTH INTERNATIONAL CONFERENCE ON ADVANCES IN PATTERN RECOGNITION (ICAPR), 2017, : 286 - 291
  • [4] Action Recognition in Radio Signals Based on Multi-Scale Deep Features
    Hao, Xiaojun
    Xu, Guangying
    Ma, Hongbin
    Yang, Shuyuan
    [J]. TENTH INTERNATIONAL CONFERENCE ON GRAPHICS AND IMAGE PROCESSING (ICGIP 2018), 2019, 11069
  • [5] Enhanced SSD with interactive multi-scale attention features for object detection
    Shuren Zhou
    Jia Qiu
    [J]. Multimedia Tools and Applications, 2021, 80 : 11539 - 11556
  • [6] Enhanced SSD with interactive multi-scale attention features for object detection
    Zhou, Shuren
    Qiu, Jia
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (08) : 11539 - 11556
  • [7] Hierarchical Multi-scale Attention Networks for action recognition
    Yan, Shiyang
    Smith, Jeremy S.
    Lu, Wenjin
    Zhang, Bailing
    [J]. SIGNAL PROCESSING-IMAGE COMMUNICATION, 2018, 61 : 73 - 84
  • [8] Multi-Scale Spatial-Temporal Integration Convolutional Tube for Human Action Recognition
    Wu, Haoze
    Liu, Jiawei
    Zhu, Xierong
    Wang, Meng
    Zha, Zheng-Jun
    [J]. PROCEEDINGS OF THE TWENTY-NINTH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, 2020, : 753 - 759
  • [9] Skeleton Motion Recognition Based on Multi-Scale Deep Spatio-Temporal Features
    Hu, Kai
    Ding, Yiwu
    Jin, Junlan
    Weng, Liguo
    Xia, Min
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (03):
  • [10] SSD with multi-scale feature fusion and attention mechanism
    Liu, Qiang
    Dong, Lijun
    Zeng, Zhigao
    Zhu, Wenqiu
    Zhu, Yanhui
    Meng, Chen
    [J]. SCIENTIFIC REPORTS, 2023, 13 (01):