A Robust Approach for Action Recognition Based on Spatio-Temporal Features in RGB-D Sequences

Cited by: 1
Authors
Ly Quoc Ngoc [1]
Vo Hoai Viet [1]
Tran Thai Son [1]
Pham Minh Hoang [1]
Affiliations
[1] VNU HCM, Univ Sci, Dept Comp Vis & Robot, Thu Duc, Ho Chi Minh, Vietnam
Keywords
Action Recognition; Depth Sequences; GMM; SVM; Multiple Features; Spatio-Temporal Features
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Recognizing human action is an attractive research topic in computer vision because it plays an important role in applications such as human-computer interaction, intelligent surveillance, human action retrieval systems, health care, smart homes, and robotics. The availability of the low-cost Microsoft Kinect sensor, which captures real-time, high-resolution RGB and depth information, has opened an opportunity to significantly increase the capabilities of many automated vision-based recognition tasks. In this paper, we propose a new framework for action recognition in RGB-D video. We extract spatio-temporal features from RGB-D data that capture visual, shape, and motion information. Moreover, a segmentation technique is applied to represent the temporal structure of the action. First, we use STIP to detect interest points in both the RGB and depth channels. Second, we apply the HOG3D descriptor to the RGB channel and the 3DS-HONV descriptor to the depth channel. In addition, we extract HOF2.5D by fusing RGB and depth to capture human motion. Third, we divide the video into segments and apply GMM to create feature vectors for each segment, so that each segment is represented by three feature vectors (HOG3D, 3DS-HONV, and HOF2.5D). Next, max pooling is applied to create a final vector for each descriptor. We then concatenate these per-descriptor vectors into the final vector for action representation. Finally, we use an SVM for classification. We evaluated the proposed method on three benchmark datasets to demonstrate generalizability. The experimental results show that it is more accurate for action recognition than previous works: we obtain overall accuracies of 93.5%, 99.16%, and 89.38% on the UTKinect-Action, 3D Action Pairs, and MSR-Daily Activity 3D datasets, respectively. These results show that our method is feasible and outperforms state-of-the-art methods on these datasets.
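The segment encoding, pooling, and concatenation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the GMM turns local descriptors into a segment vector, so the sketch assumes one simple option: averaging the component posteriors (soft assignments) of a diagonal-covariance GMM whose parameters are taken as already fitted. All function names, the dictionary layout, and the toy numbers are hypothetical and are not taken from the paper. Interest-point detection, descriptor extraction, GMM fitting, and the SVM are left out.

```python
import math

# Assumed concatenation order of the three descriptor types named in the abstract.
DESCRIPTOR_ORDER = ("HOG3D", "3DS-HONV", "HOF2.5D")


def gmm_posteriors(x, weights, means, variances):
    """Posterior probability of each diagonal-covariance component given x."""
    log_probs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        log_probs.append(log_p)
    # Normalise in log space for numerical stability.
    top = max(log_probs)
    exps = [math.exp(lp - top) for lp in log_probs]
    total = sum(exps)
    return [e / total for e in exps]


def encode_segment(descriptors, gmm):
    """Assumed encoding: mean posterior over all descriptors in one segment."""
    k = len(gmm["weights"])
    if not descriptors:
        return [0.0] * k
    acc = [0.0] * k
    for x in descriptors:
        post = gmm_posteriors(x, gmm["weights"], gmm["means"], gmm["variances"])
        acc = [a + p for a, p in zip(acc, post)]
    return [a / len(descriptors) for a in acc]


def max_pool(segment_vectors):
    """Element-wise maximum across the segment vectors of one descriptor type."""
    return [max(column) for column in zip(*segment_vectors)]


def represent_video(segments_by_descriptor, gmms):
    """Encode each segment, max-pool per descriptor type, then concatenate."""
    final = []
    for name in DESCRIPTOR_ORDER:
        encoded = [encode_segment(seg, gmms[name])
                   for seg in segments_by_descriptor[name]]
        final.extend(max_pool(encoded))
    return final


# Toy data: 1-D descriptors, two segments per video, a 2-component GMM.
toy_gmm = {"weights": [0.5, 0.5], "means": [[0.0], [4.0]],
           "variances": [[1.0], [1.0]]}
gmms = {name: toy_gmm for name in DESCRIPTOR_ORDER}
video = {name: [[[0.1], [0.3]], [[3.9], [4.2]]] for name in DESCRIPTOR_ORDER}

vector = represent_video(video, gmms)
print(len(vector))  # 3 descriptor types x 2 components = 6 values
```

In this sketch the resulting vector would be the input to a classifier such as an SVM. Max pooling keeps, for each GMM component, the strongest response seen in any segment, so the final length depends only on the number of components and descriptor types, not on the number of segments.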
Pages: 166-177
Number of pages: 12