Combining CNN streams of dynamic image and depth data for action recognition

Cited by: 17
Authors
Singh, Roshan [1]
Khurana, Rajat [2]
Kushwaha, Alok Kumar Singh [2]
Srivastava, Rajeev [1]
Affiliations
[1] IIT BHU, Dept Comp Sci & Engn, Varanasi, Uttar Pradesh, India
[2] IKG Punjab Tech Univ, Dept Comp Sci & Engn, Kapurthala, Punjab, India
Keywords
Human activity recognition; RGB-D; CNN; VGG; Multi-stream CNN models; Transfer learning; Ensemble
DOI
10.1007/s00530-019-00645-5
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
RGB-D sensors are in great demand because they produce large amounts of multimodal data, such as RGB images and depth maps, that are useful for training deep learning models. This paper proposes a deep learning model that recognizes human activities in a video sequence by combining multiple CNN streams. The model uses dynamic images generated from the RGB images and from the depth maps for three different dimensions, and it is trained on these four streams using VGG Net for action recognition. It is evaluated and compared with other state-of-the-art methods from the literature on three challenging datasets, namely MSR Daily Activity, UTD-MHAD and CAD-60, in terms of accuracy, error, recall, specificity, precision and F-score. The reported results indicate that the proposed method outperforms the compared methods.
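The abstract does not say how the dynamic images are computed or how the four streams are combined. As a rough illustration only, the sketch below assumes two common choices: a weighted temporal sum with the simplified approximate-rank-pooling weights alpha_t = 2t - T - 1, and late fusion by averaging per-stream class probabilities. The function names `dynamic_image` and `fuse_scores` are hypothetical and are not taken from the paper.

```python
import numpy as np


def dynamic_image(frames):
    """Collapse a clip of shape (T, H, W) or (T, H, W, C) into one image.

    Assumption: simplified approximate rank pooling, i.e. a weighted sum
    over time with weights alpha_t = 2t - T - 1 for t = 1..T.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]
    # Later frames get positive weights, earlier frames negative ones,
    # so static pixels cancel out and motion is emphasised.
    alpha = 2.0 * np.arange(1, num_frames + 1) - num_frames - 1
    image = np.tensordot(alpha, frames, axes=(0, 0))
    # Rescale to the 0..255 range expected by an image CNN.
    lo, hi = image.min(), image.max()
    if hi == lo:
        return np.zeros(image.shape, dtype=np.uint8)
    return np.round(255.0 * (image - lo) / (hi - lo)).astype(np.uint8)


def fuse_scores(stream_scores):
    """Late fusion (assumed): average class probabilities across streams.

    stream_scores has shape (num_streams, num_classes).
    Returns the predicted class index and the fused probability vector.
    """
    fused = np.mean(np.asarray(stream_scores, dtype=np.float64), axis=0)
    return int(np.argmax(fused)), fused
```

In this sketch, each of the four streams (one RGB dynamic image and three depth-derived dynamic images) would be fed to its own fine-tuned CNN, and `fuse_scores` would combine the four softmax outputs into one prediction.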
Pages: 313-322
Number of pages: 10
Related papers
50 in total
  • [41] Estimating pose from depth image streams
    Fujimura, K
    Zhu, YD
    Ng-Thow-Hing, V
    2005 5TH IEEE-RAS INTERNATIONAL CONFERENCE ON HUMANOID ROBOTS, 2005, : 154 - 160
  • [42] Combining molecular and cell painting image data for mechanism of action prediction
    Tian, Guangyan
    Harrison, Philip J.
    Sreenivasan, Akshai P.
    Carreras-Puigvert, Jordi
    Spjuth, Ola
    ARTIFICIAL INTELLIGENCE IN THE LIFE SCIENCES, 2023, 3
  • [43] Ensemble of Classifiers Using CNN and Hand-Crafted Features for Depth-Based Action Recognition
    Trelinski, Jacek
    Kwolek, Bogdan
    ARTIFICIAL INTELLIGENCE AND SOFT COMPUTING, ICAISC 2019, PT II, 2019, 11509 : 91 - 103
  • [44] Action recognition for depth video using multi-view dynamic images
    Xiao, Yang
    Chen, Jun
    Wang, Yancheng
    Cao, Zhiguo
    Zhou, Joey Tianyi
    Bai, Xiang
    INFORMATION SCIENCES, 2019, 480 : 287 - 304
  • [45] Palm Vein Recognition Network Combining Transformer and CNN
    Wu, Kai
    Shen, Wenzhong
    Jia, Dingding
    Liang, Juan
    Computer Engineering and Applications, 2023, 59 (24) : 98 - 109
  • [46] SAR Image Target Recognition Method Combining Multi-Resolution Representation and Complex Domain CNN
    Qiao Liangcai
    LASER & OPTOELECTRONICS PROGRESS, 2020, 57 (24)
  • [47] Inertial Sensor Data to Image Encoding for Human Action Recognition
    Ahmad, Zeeshan
    Khan, Naimul
    IEEE SENSORS JOURNAL, 2021, 21 (09) : 10978 - 10988
  • [48] Static and Dynamic Hand Gesture Recognition in Depth Data Using Dynamic Time Warping
    Plouffe, Guillaume
    Cretu, Ana-Maria
    IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT, 2016, 65 (02) : 305 - 316
  • [49] Feature Weighting in Dynamic Time Warping for Gesture Recognition in Depth Data
    Reyes, Miguel
    Dominguez, Gabriel
    Escalera, Sergio
    2011 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCV WORKSHOPS), 2011
  • [50] Information theory based pruning for CNN compression and its application to image classification and action recognition
    Phan, Hai-Hong
    Vu, Ngoc-Son
    2019 16TH IEEE INTERNATIONAL CONFERENCE ON ADVANCED VIDEO AND SIGNAL BASED SURVEILLANCE (AVSS), 2019