Sequential Deep Trajectory Descriptor for Action Recognition With Three-Stream CNN

Cited by: 157
Authors
Shi, Yemin [1 ]
Tian, Yonghong [1 ]
Wang, Yaowei [2 ]
Huang, Tiejun [1 ]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Cooperat Medianet Innovat Ctr, Natl Engn Lab Video Technol, Beijing 100871, Peoples R China
[2] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Action recognition; sequential deep trajectory descriptor (sDTD); three-stream framework; long-term motion;
DOI
10.1109/TMM.2017.2666540
Chinese Library Classification (CLC)
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Learning the spatial-temporal representation of motion information is crucial to human action recognition. Nevertheless, most of the existing features or descriptors cannot capture motion information effectively, especially for long-term motion. To address this problem, this paper proposes a long-term motion descriptor called sequential deep trajectory descriptor (sDTD). Specifically, we project dense trajectories into two-dimensional planes, and subsequently a CNN-RNN network is employed to learn an effective representation for long-term motion. Unlike the popular two-stream ConvNets, the sDTD stream is introduced into a three-stream framework so as to identify actions from a video sequence. Consequently, this three-stream framework can simultaneously capture static spatial features, short-term motion, and long-term motion in the video. Extensive experiments were conducted on three challenging datasets: KTH, HMDB51, and UCF101. Experimental results show that our method achieves state-of-the-art performance on the KTH and UCF101 datasets, and is comparable to the state-of-the-art methods on the HMDB51 dataset.
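As a rough illustration of the architecture the abstract describes (a CNN-RNN stream over projected dense-trajectory images, fused with spatial and short-term motion streams), the sketch below shows one possible three-stream layout in PyTorch. The backbone (ResNet-18), GRU hidden size, flow-stack depth, and score-averaging fusion are assumptions for illustration only, not the authors' reported configuration.

```python
# Minimal sketch of a three-stream network in the spirit of the abstract.
# All module choices here (ResNet-18 backbones, GRU aggregation, late fusion
# by score averaging) are illustrative assumptions, not the paper's exact setup.
import torch
import torch.nn as nn
from torchvision.models import resnet18


class SDTDStream(nn.Module):
    """CNN-RNN stream: per-frame CNN features over trajectory-projection
    images, aggregated by a GRU to capture long-term motion."""
    def __init__(self, num_classes, hidden=512):
        super().__init__()
        cnn = resnet18(weights=None)
        cnn.fc = nn.Identity()              # keep 512-d per-frame features
        self.cnn = cnn
        self.rnn = nn.GRU(512, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, x):                   # x: (B, T, 3, H, W)
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        _, h = self.rnn(feats)              # final hidden state summarizes the sequence
        return self.fc(h[-1])


class ThreeStreamNet(nn.Module):
    """Spatial (RGB), short-term motion (stacked optical flow), and sDTD-style
    streams, fused by averaging their class scores."""
    def __init__(self, num_classes=101):
        super().__init__()
        self.spatial = resnet18(weights=None, num_classes=num_classes)
        self.temporal = resnet18(weights=None, num_classes=num_classes)
        # assume 10 flow frames x (dx, dy) = 20 input channels for the flow stream
        self.temporal.conv1 = nn.Conv2d(20, 64, 7, stride=2, padding=3, bias=False)
        self.sdtd = SDTDStream(num_classes)

    def forward(self, rgb, flow, traj_seq):
        return (self.spatial(rgb) + self.temporal(flow) + self.sdtd(traj_seq)) / 3


if __name__ == "__main__":
    net = ThreeStreamNet(num_classes=101)
    rgb = torch.randn(2, 3, 224, 224)       # single RGB frame per clip
    flow = torch.randn(2, 20, 224, 224)      # stacked optical flow
    traj = torch.randn(2, 8, 3, 224, 224)    # sequence of trajectory projections
    print(net(rgb, flow, traj).shape)        # -> torch.Size([2, 101])
```

In this sketch the three streams are trained and evaluated independently and only their softmax-input scores are averaged; the paper's actual fusion strategy and training details should be taken from the full text.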
Pages: 1510-1520
Number of pages: 11
Related Papers
50 records in total
  • [41] Multiple stream deep learning model for human action recognition
    Gu, Ye
    Ye, Xiaofeng
    Sheng, Weihua
    Ou, Yongsheng
    Li, Yongqiang
    IMAGE AND VISION COMPUTING, 2020, 93
  • [42] Action Recognition Using Multi-stream 2D CNN with Deep Learning-Based Temporal Modality
    Kang, Keonwoo
    Park, Sangwoo
    Park, Hasil
    Kang, Donggoo
    Paik, Joonki
    2023 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS, ICCE, 2023
  • [43] Typhoon Trajectory Prediction by Three CNN+ Deep-Learning Approaches
    Lin, Gang
    Liang, Yanchun
    Tavares, Adriano
    Lima, Carlos
    Xia, Dong
    ELECTRONICS, 2024, 13 (19)
  • [44] Facial micro-expression recognition using three-stream vision transformer network with sparse sampling and relabeling
    Zhang, He
    Yin, Lu
    Zhang, Hanling
    Wu, Xuesong
    SIGNAL IMAGE AND VIDEO PROCESSING, 2024, 18 (04) : 3761 - 3771
  • [45] Facial micro-expression recognition using three-stream vision transformer network with sparse sampling and relabeling
    He Zhang
    Lu Yin
    Hanling Zhang
    Xuesong Wu
    Signal, Image and Video Processing, 2024, 18 : 3761 - 3771
  • [46] Binary dense sift flow based two stream CNN for human action recognition
    Park, Sang Kyoo
    Chung, Jun Ho
    Kang, Tae Koo
    Lim, Myo Taeg
    MULTIMEDIA TOOLS AND APPLICATIONS, 2021, 80 (28-29) : 35697 - 35720
  • [47] LEARNING GEOMETRIC FEATURES WITH DUAL-STREAM CNN FOR 3D ACTION RECOGNITION
    Thien Huynh-The
    Hua, Cam-Hao
    Nguyen Anh Tu
    Kim, Dong-Seong
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 2353 - 2357
  • [48] DKD–DAD: a novel framework with discriminative kinematic descriptor and deep attention-pooled descriptor for action recognition
    Ming Tong
    Mingyang Li
    He Bai
    Lei Ma
    Mengao Zhao
    Neural Computing and Applications, 2020, 32 : 5285 - 5302
  • [49] Two-Stream RNN/CNN for Action Recognition in 3D Videos
    Zhao, Rui
    Ali, Haider
    van der Smagt, Patrick
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 4260 - 4267
  • [50] Binary dense sift flow based two stream CNN for human action recognition
    Sang Kyoo Park
    Jun Ho Chung
    Tae Koo Kang
    Myo Taeg Lim
    Multimedia Tools and Applications, 2021, 80 : 35697 - 35720