A Context Based Deep Temporal Embedding Network in Action Recognition

Cited by: 2
Authors
Koohzadi, Maryam [1]
Charkari, Nasrollah Moghadam [1]
Affiliations
[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran, Iran
Keywords
Deep temporal embedding; Self-supervision; Residual technique; Two-step deep method; Long-term temporal representation; Attention
DOI
10.1007/s11063-020-10248-1
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Long-term temporal representation methods incur a high computational cost, which restricts their practical use in real-world applications. We propose a two-step deep residual method for efficiently learning long-term discriminative temporal representations while significantly reducing computational cost. In the first step, a novel self-supervised deep temporal embedding method embeds repetitive short-term motions in a cluster-friendly feature space. In the second step, an efficient temporal representation is built from the differences between the original data and its associated repetitive motion clusters, which constitutes a novel deep residual method. Experimental results demonstrate that the proposed method achieves competitive results on challenging human action recognition datasets such as UCF101, HMDB51, THUMOS14, and Kinetics-400.
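The two steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: plain k-means stands in for the learned self-supervised embedding, and the second step is shown as a simple per-cluster sum of residuals (in the spirit of VLAD-style aggregation). The function names `kmeans` and `residual_representation`, the toy 2-D segment features, and the choice of two clusters are all assumptions made for illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for a learned cluster-friendly embedding."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the previous center if a cluster ends up empty
                centers[j] = [sum(col) / len(group) for col in zip(*group)]
    return centers


def residual_representation(segments, centers):
    """Sum the residuals (segment minus its nearest center) per cluster, then concatenate."""
    dim = len(centers[0])
    sums = [[0.0] * dim for _ in centers]
    for s in segments:
        j = nearest(s, centers)
        for d in range(dim):
            sums[j][d] += s[d] - centers[j][d]
    return [v for row in sums for v in row]


# Toy short-term segment features for one video (hypothetical values).
segments = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 5.0], [0.2, 0.2]]
centers = kmeans(segments, k=2)
video_vector = residual_representation(segments, centers)
print(len(video_vector))  # number of clusters times feature dimension
```

The resulting vector has a fixed length regardless of how many segments a video contains, which is one reason residual aggregation of this kind is convenient for long sequences.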
Pages: 187-220
Page count: 34
Related papers
50 in total
  • [31] Learning Heterogeneous Spatial-Temporal Context for Skeleton-Based Action Recognition
    Gao, Xuehao
    Yang, Yang
    Wu, Yang
    Du, Shaoyi
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (09) : 12130 - 12141
  • [32] Deep Attention Network for Egocentric Action Recognition
    Lu, Minlong
    Li, Ze-Nian
    Wang, Yueming
    Pan, Gang
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2019, 28 (08) : 3703 - 3713
  • [33] Fine-Grained Action Recognition Based on Temporal Pyramid Excitation Network
    Zhou, Xuan
    Yi, Jianping
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (02) : 2103 - 2116
  • [34] A fast human action recognition network based on spatio-temporal features
    Xu, Jie
    Song, Rui
    Wei, Haoliang
    Guo, Jinhong
    Zhou, Yifei
    Huang, Xiwei
    NEUROCOMPUTING, 2021, 441 : 350 - 358
  • [35] Temporal Pyramid Pooling-Based Convolutional Neural Network for Action Recognition
    Wang, Peng
    Cao, Yuanzhouhan
    Shen, Chunhua
    Liu, Lingqiao
    Shen, Heng Tao
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2017, 27 (12) : 2613 - 2622
  • [36] Attention-Based Temporal Weighted Convolutional Neural Network for Action Recognition
    Zang, Jinliang
    Wang, Le
    Liu, Ziyi
    Zhang, Qilin
    Niu, Zhenxing
    Hua, Gang
    Zheng, Nanning
    ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS, AIAI 2018, 2018, 519 : 97 - 108
  • [37] Temporal Shift Module-Based Vision Transformer Network for Action Recognition
    Zhang, Kunpeng
    Lyu, Mengyan
    Guo, Xinxin
    Zhang, Liye
    Liu, Cong
    IEEE ACCESS, 2024, 12 : 47246 - 47257
  • [38] Action Recognition Network Based on Local Spatiotemporal Features and Global Temporal Excitation
    Li, Shukai
    Wang, Xiaofang
    Shan, Dongri
    Zhang, Peng
    APPLIED SCIENCES-BASEL, 2023, 13 (11)
  • [39] A Spatio-Temporal Motion Network for Action Recognition Based on Spatial Attention
    Yang, Qi
    Lu, Tongwei
    Zhou, Huabing
    ENTROPY, 2022, 24 (03)