Depth Pooling Based Large-Scale 3-D Action Recognition With Convolutional Neural Networks

Cited by: 125
Authors
Wang, Pichao [1]
Li, Wanqing [1]
Gao, Zhimin [1]
Tang, Chang [2]
Ogunbona, Philip O. [1]
Affiliations
[1] Univ Wollongong, Adv Multimedia Res Lab, Wollongong, NSW 2522, Australia
[2] China Univ Geosci, Sch Comp Sci, Wuhan 430074, Hubei, Peoples R China
Keywords
Large-scale; depth; action recognition; convolutional neural networks; gesture recognition
DOI
10.1109/TMM.2018.2818329
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes three simple, compact, yet effective representations of depth sequences, referred to respectively as dynamic depth images (DDI), dynamic depth normal images (DDNI), and dynamic depth motion normal images (DDMNI), for both isolated and continuous action recognition. These dynamic images are constructed from a segmented sequence of depth maps using hierarchical bidirectional rank pooling to effectively capture the spatial-temporal information. Specifically, DDI exploits the dynamics of postures over time, and DDNI and DDMNI exploit the 3-D structural information captured by depth maps. Based on the proposed representations, a convolutional neural network (ConvNet)-based method is developed for action recognition. The image-based representations enable us to fine-tune existing ConvNet models trained on image data without training a large number of parameters from scratch. The proposed method achieved state-of-the-art results on three large datasets, namely, the large-scale continuous gesture recognition dataset (mean Jaccard index of 0.4109), the large-scale isolated gesture recognition dataset (59.21%), and the NTU RGB+D dataset (87.08% cross-subject and 84.22% cross-view), even though only the depth modality was used.
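The abstract does not spell out how rank pooling turns a depth sequence into a single image, so the following is only a minimal sketch under stated assumptions. It uses the closed-form approximate rank pooling weights known from the dynamic image literature rather than the paper's own solver, it is single-level rather than hierarchical, and the function names (`rank_pool_weights`, `dynamic_image`, `bidirectional_dynamic_images`) are hypothetical. Each frame is a small 2-D list of depth values.

```python
def rank_pool_weights(num_frames):
    """Approximate rank pooling weights (an assumption, not the paper's solver):
    alpha_t = 2(T - t + 1) - (T + 1)(H_T - H_{t-1}), with H_k the k-th harmonic number."""
    harmonic = [0.0]
    for k in range(1, num_frames + 1):
        harmonic.append(harmonic[-1] + 1.0 / k)
    total = harmonic[num_frames]
    return [
        2.0 * (num_frames - t + 1) - (num_frames + 1) * (total - harmonic[t - 1])
        for t in range(1, num_frames + 1)
    ]


def dynamic_image(frames):
    """Collapse a list of equally sized 2-D frames into one 2-D image
    by taking a weighted sum over time at every pixel."""
    weights = rank_pool_weights(len(frames))
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [sum(w * f[r][c] for w, f in zip(weights, frames)) for c in range(width)]
        for r in range(height)
    ]


def bidirectional_dynamic_images(frames):
    """Pool the sequence in forward and in reversed temporal order."""
    return dynamic_image(frames), dynamic_image(frames[::-1])


if __name__ == "__main__":
    # Three toy 1x2 "depth maps"; real inputs would be full-resolution depth frames.
    sequence = [[[0.0, 1.0]], [[1.0, 1.0]], [[3.0, 1.0]]]
    forward, backward = bidirectional_dynamic_images(sequence)
    print(forward, backward)
```

Because the weights sum to zero, pixels that do not change over time (the second column in the toy sequence) pool to approximately zero, while changing pixels produce a signed response that differs between the forward and backward passes. The resulting images would still need rescaling to an image range before being fed to a ConvNet.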
Pages: 1051-1061
Page count: 11