Large-scale gesture recognition with a fusion of RGB-D data based on optical flow and the C3D model

Cited by: 32
Authors
Li, Yunan [1 ]
Miao, Qiguang [1 ]
Tian, Kuan [1 ]
Fan, Yingying [1 ]
Xu, Xin [1 ]
Ma, Zhenxin [1 ]
Song, Jianfeng [1 ]
Affiliations
[1] Xidian Univ, Xian, Shaanxi, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Gesture recognition; RGB-D data; Optical flow; 3D Convolutional Neural Networks;
DOI
10.1016/j.patrec.2017.12.003
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Gesture recognition has attracted great attention owing to its applications in many fields, such as human-computer interaction. However, in video-based gesture recognition, gesture-irrelevant factors such as the background hinder improvements in the recognition rate. In this paper, we propose an effective method for large-scale gesture recognition from RGB-D video data, based on a 3D Convolutional Neural Network. To give the network input that is compact yet retains sufficient motion-path information, all videos are first unified to 32 frames. Optical flow images are then constructed from the RGB videos frame by frame to help eliminate the distracting background. Next, spatiotemporal features of the background-suppressed RGB data and the depth data are extracted separately with the C3D model (a 3D CNN model) and fused using discriminant correlation analysis to boost performance. Finally, the classes are predicted with a linear SVM classifier. The proposed method achieves 54.50% accuracy on the validation subset and 60.93% on the testing subset of the Chalearn LAP IsoGD dataset, both of which outperform our results (ranked 1st place) in the Chalearn LAP Large-scale Gesture Recognition Challenge. (C) 2017 Published by Elsevier B.V.
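The abstract states that inputs are unified into 32-frame videos but does not say how. One plausible reading is uniform temporal sampling, sketched below in Python. The function name `sample_frame_indices` and the choice to repeat frames for clips shorter than 32 frames are assumptions made for illustration, not details confirmed by the record.

```python
def sample_frame_indices(n_frames, target=32):
    """Return `target` frame indices spread evenly over a clip of n_frames.

    Hypothetical helper: uniform sampling is an assumed strategy, not
    necessarily the one used by the paper's authors.
    Longer clips are subsampled; shorter clips have frames repeated.
    """
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    if target <= 0:
        raise ValueError("target must be positive")
    # Integer arithmetic keeps every index within [0, n_frames - 1].
    return [(i * n_frames) // target for i in range(target)]


if __name__ == "__main__":
    # A 64-frame clip keeps every second frame.
    print(sample_frame_indices(64)[:5])
    # A 16-frame clip repeats each frame twice.
    print(sample_frame_indices(16)[:5])
```

The returned indices can then be used to select frames from both the RGB and the depth stream, so that the two modalities stay aligned in time.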
Pages: 187-194
Page count: 8
Related papers
50 in total
  • [31] ND voxel localization using large-scale 3D environmental map and RGB-D camera
    Oishi, Shuji
    Jeong, Yongjin
    Kurazume, Ryo
    Iwashita, Yumi
    Hasegawa, Tsutomu
    2013 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND BIOMIMETICS (ROBIO), 2013, : 538 - 545
  • [32] Using Appearance-Based Hand Features For Dynamic RGB-D Gesture Recognition
    Chen, Xi
    Koskela, Markus
    2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR), 2014, : 411 - 416
  • [33] SVM and RGB-D Sensor Based Gesture Recognition for UAV Control
    Aguilar, Wilbert G.
    Cobena, Bryan
    Rodriguez, Guillermo
    Salcedo, Vinicio S.
    Collaguazo, Brayan
    AUGMENTED REALITY, VIRTUAL REALITY, AND COMPUTER GRAPHICS, AVR 2018, PT II, 2018, 10851 : 713 - 719
  • [34] Unsupervised Learning Based Static Hand Gesture Recognition from RGB-D Sensor
    Verma, Bindu
    Choudhary, Ayesha
    PROCEEDINGS OF THE EIGHTH INTERNATIONAL CONFERENCE ON SOFT COMPUTING AND PATTERN RECOGNITION (SOCPAR 2016), 2018, 614 : 304 - 314
  • [35] Large Scale Indoor 3D Mapping Using RGB-D Sensor
    Zhu, Xiaoxiao
    Cao, Qixin
    Yokoi, Hiroshi
    Jiang, Yinlai
    INTELLIGENT ROBOTICS AND APPLICATIONS, ICIRA 2016, PT I, 2016, 9834 : 313 - 321
  • [36] Abnormal behavior recognition based on feature fusion C3D network
    Deng, Lujuan
    Fu, Ruochong
    Sun, Qian
    Jiang, Min
    Li, Zuhe
    Chen, Hui
    Yu, Zeqi
    Bu, Xiangzhou
    JOURNAL OF ELECTRONIC IMAGING, 2023, 32 (02)
  • [37] Multiple Kernel Learning and Optical Flow for Action Recognition in RGB-D Video
    Viet, Vo Hoai
    Ngoc, Ly Quoc
    Son, Tran Thai
    Hoang, Pham Minh
    2015 SEVENTH INTERNATIONAL CONFERENCE ON KNOWLEDGE AND SYSTEMS ENGINEERING (KSE), 2015, : 222 - 227
  • [38] Using GNG on 3D Object Recognition in Noisy RGB-D data
    Rangel, Jose Carlos
    Morell, Vicente
    Cazorla, Miguel
    Orts-Escolano, Sergio
    Garcia-Rodriguez, Jose
    2015 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2015,
  • [39] A Large-Scale Hierarchical Multi-View RGB-D Object Dataset
    Lai, Kevin
    Bo, Liefeng
    Ren, Xiaofeng
    Fox, Dieter
    2011 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2011, : 1817 - 1824
  • [40] Human Action Recognition from RGB-D Frames Based on Real-Time 3D Optical Flow Estimation
    Ballin, Gioia
    Munaro, Matteo
    Menegatti, Emanuele
    BIOLOGICALLY INSPIRED COGNITIVE ARCHITECTURES 2012, 2013, 196 : 65 - 74