Learning Spatiotemporal Features using 3DCNN and Convolutional LSTM for Gesture Recognition

Cited by: 167
Authors
Zhang, Liang [1]
Zhu, Guangming [1]
Shen, Peiyi [1]
Song, Juan [1]
Shah, Syed Afaq [2]
Bennamoun, Mohammed [2]
Affiliations
[1] Xidian Univ, Sch Software, Xian, Shaanxi, Peoples R China
[2] Univ Western Australia, Nedlands, WA, Australia
Funding
China Postdoctoral Science Foundation; National Natural Science Foundation of China
DOI
10.1109/ICCVW.2017.369
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Gesture recognition aims to understand ongoing human gestures. In this paper, we present a deep architecture to learn spatiotemporal features for gesture recognition. The deep architecture first learns 2D spatiotemporal feature maps using 3D convolutional neural networks (3DCNN) and bidirectional convolutional long short-term memory networks (ConvLSTM). The learnt 2D feature maps can encode the global temporal information and local spatial information simultaneously. Then, a 2DCNN is used to learn higher-level spatiotemporal features from the 2D feature maps for the final gesture recognition. The spatiotemporal correlation information is kept through the whole process of feature learning. This makes the deep architecture an effective spatiotemporal feature learner. Experiments on the ChaLearn LAP large-scale isolated gesture dataset (IsoGD) and the Sheffield Kinect Gesture (SKIG) dataset demonstrate the superiority of the proposed deep architecture.
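The three-stage pipeline described in the abstract (3DCNN, then bidirectional ConvLSTM, then 2DCNN) can be illustrated by tracing tensor shapes through it. The sketch below is not the authors' implementation. Every layer size, the clip size, the function names, and the choice to keep only the final hidden state of each ConvLSTM direction are assumptions made for illustration; the abstract states only that the first two stages yield 2D feature maps that the 2DCNN then refines.

```python
from typing import Tuple

Shape4 = Tuple[int, int, int, int]  # (channels, frames, height, width)
Shape3 = Tuple[int, int, int]       # (channels, height, width)


def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def conv3d_shape(shape: Shape4, out_channels: int,
                 kernel: int = 3, padding: int = 1) -> Shape4:
    """Stage 1 (assumed sizes): a 3D convolution that keeps the time axis."""
    _, t, h, w = shape
    return (out_channels,
            conv_out(t, kernel, 1, padding),
            conv_out(h, kernel, 1, padding),
            conv_out(w, kernel, 1, padding))


def bidirectional_convlstm_shape(shape: Shape4, hidden_channels: int) -> Shape3:
    """Stage 2 (assumption): keep the last hidden state of the forward and
    backward passes and concatenate them on the channel axis, so the time
    axis disappears and a stack of 2D feature maps remains."""
    _, _, h, w = shape
    return (2 * hidden_channels, h, w)


def conv2d_shape(shape: Shape3, out_channels: int, kernel: int = 3,
                 stride: int = 2, padding: int = 1) -> Shape3:
    """Stage 3 (assumed sizes): a strided 2D convolution on the 2D maps."""
    _, h, w = shape
    return (out_channels,
            conv_out(h, kernel, stride, padding),
            conv_out(w, kernel, stride, padding))


def trace(clip_shape: Shape4) -> Tuple[Shape4, Shape3, Shape3]:
    """Return the shape after each of the three stages."""
    after_3dcnn = conv3d_shape(clip_shape, out_channels=64)
    after_convlstm = bidirectional_convlstm_shape(after_3dcnn, hidden_channels=64)
    after_2dcnn = conv2d_shape(after_convlstm, out_channels=128)
    return after_3dcnn, after_convlstm, after_2dcnn


if __name__ == "__main__":
    # Hypothetical input: 32 RGB frames of 112 x 112 pixels.
    for stage in trace((3, 32, 112, 112)):
        print(stage)
```

Under these assumed sizes, the time axis is present after the first stage and absent after the second, which is the sense in which the 2D feature maps summarize the whole clip before the 2DCNN processes them.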
Pages: 3120-3128
Number of pages: 9
Related papers
50 in total
  • [41] Convolutional LSTM: A Deep Learning Method for Motion Intention Recognition Based on Spatiotemporal EEG Data
    Fang, Zhijie
    Wang, Weiqun
    Hou, Zeng-Guang
    NEURAL INFORMATION PROCESSING (ICONIP 2019), PT IV, 2019, 1142 : 216 - 224
  • [42] A hybrid cellular automaton model integrated with 3DCNN and LSTM for simulating land use/cover change
    Yang, Wei
    Zhang, Yu
    Hou, Kun
    Wang, Xuejing
    INTERNATIONAL JOURNAL OF DIGITAL EARTH, 2025, 18 (01)
  • [43] ResMorCNN Model: Hyperspectral Images Classification Using Residual-Injection Morphological Features and 3DCNN Layers
    Esmaeili, Mohammad
    Abbasi-Moghadam, Dariush
    Sharifi, Alireza
    Tariq, Aqil
    Li, Qingting
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2024, 17 : 219 - 243
  • [44] 3DCNN predicting brain age using diffusion tensor imaging
    Wang, Yuqi
    Wen, Jingxi
    Xin, Jiang
    Zhang, Yunhao
    Xie, Hua
    Tang, Yan
    MEDICAL & BIOLOGICAL ENGINEERING & COMPUTING, 2023, 61 (12) : 3335 - 3344
  • [45] Lip-Reading Classification of Turkish Digits Using Ensemble Learning Architecture Based on 3DCNN
    Erbey, Ali
    Barisci, Necaattin
    APPLIED SCIENCES-BASEL, 2025, 15 (02):
  • [46] Growing Memory Network with Random Weight 3DCNN for Continuous Human Action Recognition
    Dou, Wenbang
    Chin, Wei Hong
    Kubota, Naoyuki
    2023 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS, FUZZ, 2023,
  • [47] Activity Recognition Using Temporal Optical Flow Convolutional Features and Multilayer LSTM
    Ullah, Amin
    Muhammad, Khan
    Del Ser, Javier
    Baik, Sung Wook
    de Albuquerque, Victor Hugo C.
    IEEE TRANSACTIONS ON INDUSTRIAL ELECTRONICS, 2019, 66 (12) : 9692 - 9702
  • [48] P3CMQA: Single-Model Quality Assessment Using 3DCNN with Profile-Based Features
    Takei, Yuma
    Ishida, Takashi
    BIOENGINEERING-BASEL, 2021, 8 (03):
  • [49] Pedestrian Detection from Sparse Point-Cloud using 3DCNN
    Tatebe, Yoshiki
    Deguchi, Daisuke
    Kawanishi, Yasutomo
    Ide, Ichiro
    Murase, Hiroshi
    Sakai, Utsushi
    2018 INTERNATIONAL WORKSHOP ON ADVANCED IMAGE TECHNOLOGY (IWAIT), 2018,
  • [50] Human Activity Recognition Using Robust Spatiotemporal Features and Convolutional Neural Network
    Uddin, Md Zia
    Khaksar, Weria
    Torresen, Jim
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTISENSOR FUSION AND INTEGRATION FOR INTELLIGENT SYSTEMS (MFI), 2017, : 144 - 149