Learning Spatiotemporal Features using 3DCNN and Convolutional LSTM for Gesture Recognition

Cited by: 167

Authors:

Zhang, Liang [1]

Zhu, Guangming [1]

Shen, Peiyi [1]

Song, Juan [1]

Shah, Syed Afaq [2]

Bennamoun, Mohammed [2]

Affiliations:

[1] Xidian University, School of Software, Xi'an, Shaanxi, People's Republic of China

[2] University of Western Australia, Nedlands, WA, Australia

Source:

2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2017) | 2017

Funding:

China Postdoctoral Science Foundation; National Natural Science Foundation of China

DOI:

10.1109/ICCVW.2017.369

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Gesture recognition aims to understand ongoing human gestures. In this paper, we present a deep architecture that learns spatiotemporal features for gesture recognition. The architecture first learns 2D spatiotemporal feature maps using 3D convolutional neural networks (3DCNN) and bidirectional convolutional long short-term memory networks (ConvLSTM). The learnt 2D feature maps encode global temporal information and local spatial information simultaneously. A 2DCNN then learns higher-level spatiotemporal features from the 2D feature maps for the final gesture recognition. Spatiotemporal correlation information is preserved throughout feature learning, which makes the architecture an effective spatiotemporal feature learner. Experiments on the ChaLearn LAP large-scale isolated gesture dataset (IsoGD) and the Sheffield Kinect Gesture (SKIG) dataset demonstrate the superiority of the proposed deep architecture.
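
The abstract gives no layer sizes, so the following is only a rough sketch of how feature-map shapes might flow through the three stages it names (3DCNN, bidirectional ConvLSTM, 2DCNN). The input clip size, kernel sizes, filter counts, and the helper names `conv_out` and `trace_shapes` are all illustrative assumptions, as is the choice to collapse the time axis by keeping one final hidden state per ConvLSTM direction; none of these come from the paper.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Standard output-size formula for a convolution or pooling along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_shapes(channels, frames, height, width,
                 cnn3d_filters=64, lstm_hidden=64, cnn2d_filters=256):
    """Trace (hypothetical) feature-map shapes through the three stages.

    All layer hyperparameters here are assumptions chosen for illustration.
    """
    shapes = {"input": (channels, frames, height, width)}

    def same_conv_then_halve(n):
        # 3-wide convolution with padding 1 keeps the size,
        # then a 2-wide pooling with stride 2 halves it.
        return conv_out(conv_out(n, kernel=3, padding=1), kernel=2, stride=2)

    # Stage 1: 3D convolution + pooling over time, height, and width.
    t, h, w = (same_conv_then_halve(n) for n in (frames, height, width))
    shapes["3dcnn"] = (cnn3d_filters, t, h, w)

    # Stage 2: bidirectional ConvLSTM. Assumed here: spatial size is preserved,
    # and the final hidden state of each direction is kept and concatenated
    # along channels, which removes the time axis and leaves 2D feature maps.
    shapes["bi_convlstm"] = (2 * lstm_hidden, h, w)

    # Stage 3: 2D convolution + pooling on the resulting 2D maps.
    shapes["2dcnn"] = (cnn2d_filters,
                       same_conv_then_halve(h),
                       same_conv_then_halve(w))
    return shapes


if __name__ == "__main__":
    # Hypothetical RGB clip: 32 frames of 112 x 112 pixels.
    for stage, shape in trace_shapes(3, 32, 112, 112).items():
        print(f"{stage:12s} {shape}")
```

Under these assumptions the time axis survives the 3DCNN stage and disappears only after the ConvLSTM stage, which is one way to read the abstract's statement that the 2D feature maps encode global temporal and local spatial information together.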

Pages: 3120-3128

Number of pages: 9

50 records in total

[41] Convolutional LSTM: A Deep Learning Method for Motion Intention Recognition Based on Spatiotemporal EEG Data
Fang, Zhijie
Wang, Weiqun
Hou, Zeng-Guang
NEURAL INFORMATION PROCESSING (ICONIP 2019), PT IV, 2019, 1142 : 216 - 224
[42] A hybrid cellular automaton model integrated with 3DCNN and LSTM for simulating land use/cover change
Yang, Wei
Zhang, Yu
Hou, Kun
Wang, Xuejing
INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2025, 18 (01)
[43] ResMorCNN Model: Hyperspectral Images Classification Using Residual-Injection Morphological Features and 3DCNN Layers
Esmaeili, Mohammad
Abbasi-Moghadam, Dariush
Sharifi, Alireza
Tariq, Aqil
Li, Qingting
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 219 - 243
[44] 3DCNN predicting brain age using diffusion tensor imaging
Wang, Yuqi
Wen, Jingxi
Xin, Jiang
Zhang, Yunhao
Xie, Hua
Tang, Yan
MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (12) : 3335 - 3344
[45] Lip-Reading Classification of Turkish Digits Using Ensemble Learning Architecture Based on 3DCNN
Erbey, Ali
Barisci, Necaattin
APPLIED SCIENCES-BASEL, 2025, 15 (02)
[46] Growing Memory Network with Random Weight 3DCNN for Continuous Human Action Recognition
Dou, Wenbang
Chin, Wei Hong
Kubota, Naoyuki
2023 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ, 2023
[47] Activity Recognition Using Temporal Optical Flow Convolutional Features and Multilayer LSTM
Ullah, Amin
Muhammad, Khan
Del Ser, Javier
Baik, Sung Wook
de Albuquerque, Victor Hugo C.
IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2019, 66 (12) : 9692 - 9702
[48] P3CMQA: Single-Model Quality Assessment Using 3DCNN with Profile-Based Features
Takei, Yuma
Ishida, Takashi
BIOENGINEERING-BASEL, 2021, 8 (03)
[49] Pedestrian Detection from Sparse Point-Cloud using 3DCNN
Tatebe, Yoshiki
Deguchi, Daisuke
Kawanishi, Yasutomo
Ide, Ichiro
Murase, Hiroshi
Sakai, Utsushi
2018 INTERNATIONAL WORKSHOP ON ADVANCED IMAGE TECHNOLOGY (IWAIT), 2018
[50] Human Activity Recognition Using Robust Spatiotemporal Features and Convolutional Neural Network
Uddin, Md Zia
Khaksar, Weria
Torresen, Jim
2017 IEEE INTERNATIONAL CONFERENCE ON MULTISENSOR FUSION AND INTEGRATION FOR INTELLIGENT SYSTEMS (MFI), 2017 : 144 - 149
