Spatial–temporal injection network: exploiting auxiliary losses for action recognition with apparent difference and self-attention

Cited by: 0
Authors
Haiwen Cao
Chunlei Wu
Jing Lu
Jie Wu
Leiquan Wang
Institutions
[1] China University of Petroleum,College of Computer Science and Technology
Keywords
Action recognition; Apparent difference module; Self-attention mechanism; Spatiotemporal Features;
DOI: not available
Abstract
Two-stream convolutional networks have shown strong performance in action recognition. However, the spatial and temporal features of the two streams are learned separately, and the same operations are applied to both streams with almost no regard for their different characteristics. In this paper, we build upon two-stream convolutional networks and propose a novel spatial–temporal injection network (STIN) with two different auxiliary losses. To build spatial–temporal features as the video representation, an apparent difference module is designed to impose auxiliary temporal constraints on the spatial features in the spatial injection network. A self-attention mechanism is used to attend to regions of interest in the temporal injection stream, reducing the influence of optical-flow noise from irrelevant regions. These auxiliary losses enable efficient training of two complementary streams that capture interactions between spatial and temporal information from different perspectives. Experiments on two well-known datasets, UCF101 and HMDB51, demonstrate the effectiveness of the proposed STIN.
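The abstract describes two auxiliary components but gives no implementation details. A minimal NumPy sketch of the two underlying ideas, as one plausible reading, might look like the following: an apparent-difference signal computed as consecutive-frame feature deltas (a candidate temporal constraint on the spatial stream), and plain scaled dot-product self-attention over temporal-stream features to down-weight isolated noisy regions. All function names, shapes, and design choices here are illustrative assumptions, not the authors' actual modules.

```python
import numpy as np

def apparent_difference(frame_feats):
    # frame_feats: (T, D) per-frame spatial features.
    # Illustrative assumption: the "apparent difference" is the feature
    # delta between consecutive frames, usable as an auxiliary temporal
    # target (loss) for the spatial injection stream.
    return frame_feats[1:] - frame_feats[:-1]          # (T-1, D)

def self_attention(x):
    # x: (N, D) temporal-stream features (e.g. flow-region descriptors).
    # Scaled dot-product self-attention: each output row is a weighted
    # mix of all rows, so regions consistent with the rest of the
    # sequence dominate, which can suppress isolated optical-flow noise.
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                      # (N, N) similarity
    scores -= scores.max(axis=-1, keepdims=True)       # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
    return weights @ x                                 # (N, D) attended feats

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))
print(apparent_difference(feats).shape)  # (7, 16)
print(self_attention(feats).shape)       # (8, 16)
```

In a full model, the difference signal would typically drive an auxiliary regression loss on the spatial stream while the attended features feed the temporal stream's classifier; how the paper combines the two losses is not specified in the abstract.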
Pages: 1173–1180 (7 pages)
Related papers
50 in total
  • [41] STGL-GCN: Spatial-Temporal Mixing of Global and Local Self-Attention Graph Convolutional Networks for Human Action Recognition
    Xie, Zhenggui
    Zheng, Gengzhong
    Miao, Liming
    Huang, Wei
    IEEE ACCESS, 2023, 11 : 16526 - 16532
  • [42] Relation-mining self-attention network for skeleton-based human action recognition
    Gedamu, Kumie
    Ji, Yanli
    Gao, LingLing
    Yang, Yang
    Shen, Heng Tao
    PATTERN RECOGNITION, 2023, 139
  • [43] Convolutional Self-attention Guided Graph Neural Network for Few-Shot Action Recognition
    Pan, Fei
    Guo, Jie
    Guo, Yanwen
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT II, 2023, 14087 : 401 - 412
  • [44] Integrating Temporal and Spatial Attention for Video Action Recognition
    Zhou, Yuanding
    Li, Baopu
    Wang, Zhihui
    Li, Haojie
    SECURITY AND COMMUNICATION NETWORKS, 2022, 2022
  • [45] Joint spatial-temporal attention for action recognition
    Yu, Tingzhao
    Guo, Chaoxu
    Wang, Lingfeng
    Gu, Huxiang
    Xiang, Shiming
    Pan, Chunhong
    PATTERN RECOGNITION LETTERS, 2018, 112 : 226 - 233
  • [46] Spatial self-attention network with self-attention distillation for fine-grained image recognition
    Baffour, Adu Asare
    Qin, Zhen
    Wang, Yong
    Qin, Zhiguang
    Choo, Kim-Kwang Raymond
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2021, 81
  • [47] Masked face recognition with convolutional visual self-attention network
    Ge, Yiming
    Liu, Hui
    Du, Junzhao
    Li, Zehua
    Wei, Yuheng
    NEUROCOMPUTING, 2023, 518 : 496 - 506
  • [48] Self-Attention based Siamese Neural Network recognition Model
    Liu, Yuxing
    Chang, Geng
    Fu, Guofeng
    Wei, Yingchao
    Lan, Jie
    Liu, Jiarui
    2022 34TH CHINESE CONTROL AND DECISION CONFERENCE, CCDC, 2022, : 721 - 724
  • [49] R-STAN: Residual Spatial-Temporal Attention Network for Action Recognition
    Liu, Quanle
    Che, Xiangjiu
    Bie, Mei
    IEEE ACCESS, 2019, 7 : 82246 - 82255
  • [50] Spatial–Temporal gated graph attention network for skeleton-based action recognition
    Rahevar, Mrugendrasinh
    Ganatra, Amit
    Pattern Analysis and Applications, 2023, 26 (3) : 929 - 939