A Context Based Deep Temporal Embedding Network in Action Recognition

Cited by: 2
Authors
Koohzadi, Maryam [1]
Charkari, Nasrollah Moghadam [1]
Affiliations
[1] Tarbiat Modares Univ, Dept Elect & Comp Engn, Tehran, Iran
Keywords
Deep temporal embedding; Self-supervision; Residual technique; Two-step deep method; Long-term temporal representation; Attention
DOI
10.1007/s11063-020-10248-1
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Long-term temporal representation methods incur high computational cost, which restricts their practical use in real-world applications. We propose a two-step deep residual method for efficiently learning long-term discriminative temporal representations while significantly reducing computational cost. In the first step, a novel self-supervised deep temporal embedding method embeds repetitive short-term motions in a cluster-friendly feature space. In the second step, an efficient temporal representation is built by leveraging the differences between the original data and its associated repetitive motion clusters, as a novel deep residual method. Experimental results demonstrate that the proposed method achieves competitive results on challenging human action recognition datasets such as UCF101, HMDB51, THUMOS14, and Kinetics-400.
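As a rough illustration of the second step described in the abstract, the sketch below computes residuals between short-term embeddings and their nearest cluster centroids. It is not the authors' implementation: the function name `residual_representation`, the Euclidean nearest-centroid assignment, and the toy arrays are all assumptions made for illustration, and this record does not specify the paper's actual embedding network or residual formulation.

```python
import numpy as np

def residual_representation(features, centroids):
    """Hypothetical helper: residuals of embeddings against nearest centroids.

    features:  (T, D) array of short-term motion embeddings (assumed given).
    centroids: (K, D) array of cluster centers for repetitive motions (assumed given).
    Returns the cluster index per embedding and the (T, D) residual array.
    """
    # Squared Euclidean distance from every embedding to every centroid: (T, K)
    diffs = features[:, None, :] - centroids[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)

    # Assign each embedding to its closest cluster (an assumed assignment rule)
    assignments = sq_dists.argmin(axis=1)

    # Residual = original embedding minus its associated cluster center
    residuals = features - centroids[assignments]
    return assignments, residuals

# Toy data, purely illustrative
features = np.array([[0.0, 0.0], [1.0, 1.0], [10.0, 10.0]])
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
assignments, residuals = residual_representation(features, centroids)
```

In this toy setting, embeddings that sit exactly on a centroid yield a zero residual, so only the deviation from the repetitive motion pattern is carried forward.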
Pages: 187-220
Number of pages: 34
Related papers
50 records in total
  • [41] Spatial-temporal pyramid based Convolutional Neural Network for action recognition
    Zheng, Zhenxing
    An, Gaoyun
    Wu, Dapeng
    Ruan, Qiuqi
    NEUROCOMPUTING, 2019, 358 : 446 - 455
  • [42] Temporal Refinement Graph Convolutional Network for Skeleton-Based Action Recognition
    Zhuang T.
    Qin Z.
    Ding Y.
    Deng F.
    Chen L.
    Qin Z.
    Raymond Choo K.-K.
    IEEE Transactions on Artificial Intelligence, 2024, 5 (04): : 1586 - 1598
  • [43] Hierarchical Spatio-Temporal Context Modeling for Action Recognition
    Sun, Ju
    Wu, Xiao
    Yan, Shuicheng
    Cheong, Loong-Fah
    Chua, Tat-Seng
    Li, Jintao
    CVPR: 2009 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-4, 2009, : 2004 - +
  • [44] Projection transform on spatio-temporal context for action recognition
    Xu, Wanru
    Miao, Zhenjiang
    Zhang, Qiang
    MULTIMEDIA TOOLS AND APPLICATIONS, 2015, 74 (18) : 7711 - 7728
  • [45] Projection transform on spatio-temporal context for action recognition
    Wanru Xu
    Zhenjiang Miao
    Qiang Zhang
    Multimedia Tools and Applications, 2015, 74 : 7711 - 7728
  • [46] Action recognition using exemplar-based embedding
    Weinland, Daniel
    Boyer, Edmond
    2008 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOLS 1-12, 2008, : 3033 - 3039
  • [47] Facial Expression Recognition Based on Deep Spatio-Temporal Attention Network
    Li, Shuqin
    Zheng, Xiangwei
    Zhang, Xia
    Chen, Xuanchi
    Li, Wei
    COLLABORATIVE COMPUTING: NETWORKING, APPLICATIONS AND WORKSHARING, COLLABORATECOM 2022, PT II, 2022, 461 : 516 - 532
  • [48] Deep Attributed Network Embedding Based on the PPMI
    Dong, Kunjie
    Huang, Tong
    Zhou, Lihua
    Wang, Lizhen
    Chen, Hongmei
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS: DASFAA 2021 INTERNATIONAL WORKSHOPS, 2021, 12680 : 251 - 266
  • [49] Efficient spatio-temporal network for action recognition
    Su, Yanxiong
    Zhao, Qian
    JOURNAL OF REAL-TIME IMAGE PROCESSING, 2024, 21 (05)
  • [50] Temporal Spiking Recurrent Neural Network for Action Recognition
    Wang, Wei
    Hao, Siyuan
    Wei, Yunchao
    Xia, Shengtao
    Feng, Jiashi
    Sebe, Nicu
    IEEE ACCESS, 2019, 7 : 117165 - 117175