Embedding Sequential Information into Spatiotemporal Features for Action Recognition

Cited by: 11
Authors
Ye, Yuancheng [1 ]
Tian, Yingli [1 ,2 ]
Affiliations
[1] CUNY, Grad Ctr, New York, NY 10021 USA
[2] CUNY, City Coll, New York, NY 10021 USA
Keywords
DOI
10.1109/CVPRW.2016.142
CLC number (Chinese Library Classification)
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we introduce a novel framework for video-based action recognition that combines sequential information with spatiotemporal features. Specifically, spatiotemporal features are extracted from sliced clips of the video, and a recurrent neural network is then applied to embed the sequential information into the final feature representation of the video. In contrast to most current deep learning methods for video-based tasks, our framework incorporates both the long-term dependencies and the spatiotemporal information of the clips in the video. To extract the spatiotemporal features from the clips, both dense trajectories (DT) and a recently proposed 3D convolutional neural network, C3D, are applied in our experiments. The proposed framework is evaluated on the benchmark datasets UCF101 and HMDB51, and achieves performance comparable to state-of-the-art results.
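The abstract outlines a two-stage pipeline: spatiotemporal features are first extracted per clip (with DT or C3D), and a recurrent network then aggregates the clip features in order to form the video-level representation. The following PyTorch sketch illustrates that structure only; the tiny 3D-CNN stand-in for C3D, the LSTM size, the use of the final hidden state as the video representation, and the classifier head are illustrative assumptions rather than the authors' configuration.

# Minimal sketch of the clip-features-plus-RNN pipeline described in the abstract.
# All layer sizes are assumptions; the 3D-CNN below is a toy stand-in for C3D.
import torch
import torch.nn as nn


class ClipFeatureExtractor(nn.Module):
    """Toy 3D-CNN standing in for C3D (the real C3D has 8 conv layers)."""

    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global spatiotemporal pooling
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, clip):            # clip: (B, 3, T, H, W)
        x = self.conv(clip).flatten(1)  # (B, 64)
        return self.fc(x)               # (B, feat_dim)


class SequentialActionModel(nn.Module):
    """Encode each clip, then run an LSTM over the ordered clip sequence."""

    def __init__(self, feat_dim=256, hidden=128, num_classes=101):
        super().__init__()
        self.extractor = ClipFeatureExtractor(feat_dim)
        self.rnn = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, clips):           # clips: (B, N_clips, 3, T, H, W)
        b, n = clips.shape[:2]
        feats = self.extractor(clips.flatten(0, 1))  # (B*N, feat_dim)
        feats = feats.view(b, n, -1)                 # (B, N, feat_dim)
        _, (h_n, _) = self.rnn(feats)                # final hidden state
        return self.classifier(h_n[-1])              # (B, num_classes)


if __name__ == "__main__":
    # 2 videos, each sliced into 4 clips of 8 RGB frames at 112x112.
    dummy = torch.randn(2, 4, 3, 8, 112, 112)
    logits = SequentialActionModel()(dummy)
    print(logits.shape)  # torch.Size([2, 101]); UCF101 has 101 classes

Taking the LSTM's final hidden state as the video representation is one simple way to embed clip order; the paper's exact aggregation and training setup may differ.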
Pages: 1110-1118 (9 pages)
Related papers (50 in total)
  • [31] Features for Action Recognition
    Le T.
    Duc N.H.
    Nguyen C.T.
    Tran M.T.
    Informatica (Slovenia), 2023, 47 (03): 327-334
  • [32] Spatiotemporal saliency for human action recognition
    Oikonomopoulos, A
    Patras, I
    Pantic, M
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005: 430-433
  • [33] Spatiotemporal information deep fusion network with frame attention mechanism for video action recognition
    Ou, Hongshi
    Sun, Jifeng
    JOURNAL OF ELECTRONIC IMAGING, 2019, 28 (02)
  • [34] Learning Spatiotemporal Features for Infrared Action Recognition with 3D Convolutional Neural Networks
    Jiang, Zhuolin
    Rozgic, Viktor
    Adali, Sancar
    2017 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS (CVPRW), 2017: 309-317
  • [35] Sequential Segment Networks for Action Recognition
    Chen, Quan-Qi
    Zhang, Yu-Jin
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (05): 712-716
  • [36] A Variational Information Bottleneck Based Method to Compress Sequential Networks for Human Action Recognition
    Srivastava, Ayush
    Dutta, Oshin
    Gupta, Jigyasa
    Agarwal, Sumeet
    Ap, Prathosh
    2021 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION WACV 2021, 2021: 2744-2753
  • [37] SPATIOTEMPORAL SALIENCY AND SUB ACTION SEGMENTATION FOR HUMAN ACTION RECOGNITION
    Babu, Abhishek
    Shyna, A.
    2017 8TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND NETWORKING TECHNOLOGIES (ICCCNT), 2017,
  • [38] Spatiotemporal features of human motion for gait recognition
    Khan, Muhammad Hassan
    Farid, Muhammad Shahid
    Grzegorzek, Marcin
    SIGNAL IMAGE AND VIDEO PROCESSING, 2019, 13 (02): 369-377
  • [39] Spatiotemporal Residual Networks for Video Action Recognition
    Feichtenhofer, Christoph
    Pinz, Axel
    Wildes, Richard P.
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 29 (NIPS 2016), 2016, 29
  • [40] Learning Spatiotemporal Attention for Egocentric Action Recognition
    Lu, Minlong
    Liao, Danping
    Li, Ze-Nian
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW), 2019: 4425-4434