Sequential Deep Trajectory Descriptor for Action Recognition With Three-Stream CNN

Cited by: 157
|
Authors
Shi, Yemin [1 ]
Tian, Yonghong [1 ]
Wang, Yaowei [2 ]
Huang, Tiejun [1 ]
Affiliations
[1] Peking Univ, Sch Elect Engn & Comp Sci, Cooperat Medianet Innovat Ctr, Natl Engn Lab Video Technol, Beijing 100871, Peoples R China
[2] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Action recognition; sequential deep trajectory descriptor (sDTD); three-stream framework; long-term motion;
DOI
10.1109/TMM.2017.2666540
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Learning the spatial-temporal representation of motion information is crucial to human action recognition. Nevertheless, most of the existing features or descriptors cannot capture motion information effectively, especially for long-term motion. To address this problem, this paper proposes a long-term motion descriptor called sequential deep trajectory descriptor (sDTD). Specifically, we project dense trajectories into two-dimensional planes, and subsequently a CNN-RNN network is employed to learn an effective representation for long-term motion. Unlike the popular two-stream ConvNets, the sDTD stream is introduced into a three-stream framework so as to identify actions from a video sequence. Consequently, this three-stream framework can simultaneously capture static spatial features, short-term motion, and long-term motion in the video. Extensive experiments were conducted on three challenging datasets: KTH, HMDB51, and UCF101. Experimental results show that our method achieves state-of-the-art performance on the KTH and UCF101 datasets, and is comparable to the state-of-the-art methods on the HMDB51 dataset.
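The abstract's three-stream design is typically combined at the score level: each stream (static spatial, short-term motion, and the sDTD long-term motion stream) produces per-class scores, and the final prediction is a weighted fusion of their softmax probabilities. The sketch below illustrates that late-fusion step only; the stream weights and function names are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(scores - np.max(scores))
    return e / e.sum()

def fuse_three_streams(spatial_scores, short_term_scores, sdtd_scores,
                       weights=(1.0, 1.0, 1.0)):
    """Late-fuse per-class scores from three streams by a weighted average
    of their softmax probabilities. Returns (predicted_class, fused_probs).
    The equal default weights are a placeholder; in practice they would be
    tuned on a validation set."""
    probs = [softmax(np.asarray(s, dtype=float))
             for s in (spatial_scores, short_term_scores, sdtd_scores)]
    w = np.asarray(weights, dtype=float)
    fused = sum(wi * p for wi, p in zip(w, probs)) / w.sum()
    return int(np.argmax(fused)), fused
```

With equal weights, a class favored by two of the three streams wins even if the remaining stream disagrees, which is the intuition behind fusing complementary spatial, short-term, and long-term evidence.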
Pages: 1510-1520
Page count: 11
Related Papers
50 records in total
  • [31] 3s-STNet: three-stream spatial-temporal network with appearance and skeleton information learning for action recognition
    Fang, Ming
    Peng, Siyu
    Zhao, Yang
    Yuan, Haibo
    Hung, Chih-Cheng
    Liu, Shuhua
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (02): : 1835 - 1848
  • [32] Dynamic Gesture Recognition Based on Three-Stream Coordinate Attention Network and Knowledge Distillation
    Wan, Shanshan
    Yang, Lan
    Ding, Keliang
    Qiu, Dongwei
    IEEE ACCESS, 2023, 11 : 50547 - 50559
  • [33] Human Action Recognition With Trajectory Based Covariance Descriptor In Unconstrained Videos
    Wang, Hanli
    Yi, Yun
    Wu, Jun
    MM'15: PROCEEDINGS OF THE 2015 ACM MULTIMEDIA CONFERENCE, 2015, : 1175 - 1178
  • [34] Deep temporal motion descriptor (DTMD) for human action recognition
    Nida, Nudrat
    Yousaf, Muhammad Haroon
    Irtaza, Aun
    Velastin, Sergio A.
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2020, 28 (03) : 1371 - 1385
  • [35] Ensemble Three-Stream RGB-S Deep Neural Network for Human Behavior Recognition Under Intelligent Home Service Robot Environments
    Byeon, Yeong-Hyeon
    Kim, Dohyung
    Lee, Jaeyeon
    Kwak, Keun-Chang
    IEEE ACCESS, 2021, 9 : 73240 - 73250
  • [36] Three-Stream Network With Bidirectional Self-Attention for Action Recognition in Extreme Low Resolution Videos (vol 26, pg 1187, 2019)
    Purwanto, Didik
    Pramono, Rizard Renanda Adhi
    Chen, Yie-Tarng
    Fang, Wen-Hsien
    IEEE SIGNAL PROCESSING LETTERS, 2020, 27 : 2188 - 2188
  • [37] Building roof wireframe extraction from aerial images using a three-stream deep neural network
    Esmaeily, Zahra
    Rezaeian, Mehdi
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (01)
  • [38] A three-stream fusion network for 3D skeleton-based action recognition
    Ming Fang
    Qi Liu
    Jianping Ren
    Jie Li
    Xinning Du
    Shuhua Liu
    Multimedia Systems, 2025, 31 (3)
  • [39] NIRExpNet: Three-Stream 3D Convolutional Neural Network for Near Infrared Facial Expression Recognition
    Wu, Zhan
    Chen, Tong
    Chen, Ying
    Zhang, Zhihao
    Liu, Guangyuan
    APPLIED SCIENCES-BASEL, 2017, 7 (11):
  • [40] Two-stream Deep Representation for Human Action Recognition
    Ghrab, Najla Bouarada
    Fendri, Emna
    Hammami, Mohamed
    FOURTEENTH INTERNATIONAL CONFERENCE ON MACHINE VISION (ICMV 2021), 2022, 12084