Performance evaluation of deep feature learning for RGB-D image/video classification

Cited by: 60
Authors
Shao, Ling [1 ,2 ]
Cai, Ziyun [3 ]
Liu, Li [2 ]
Lu, Ke [4 ,5 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
[3] Univ Sheffield, Dept Elect & Elect Engn, Mappin St, Sheffield S1 3JD, S Yorkshire, England
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[5] Beijing Ctr Math & Informat Interdisciplinary Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation
Keywords
Deep neural networks; RGB-D data; Feature learning; Performance evaluation; Recognition
DOI
10.1016/j.ins.2017.01.013
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Deep neural networks for image/video classification have achieved considerable success in a range of computer vision applications. Existing deep learning algorithms are widely applied to RGB image and video data. Meanwhile, with the development of low-cost RGB-D sensors (such as Microsoft Kinect and Xtion Pro-Live), high-quality RGB-D data can be easily acquired and used to enhance computer vision algorithms [14]. It is therefore of interest to investigate how deep learning can be employed for extracting and fusing features from RGB-D data. In this paper, after briefly reviewing the basic concepts of RGB-D information and four prevalent deep learning models (i.e., Deep Belief Networks (DBNs), Stacked Denoising Auto-Encoders (SDAE), Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) neural networks), we conduct extensive experiments on five popular RGB-D datasets, comprising three image datasets and two video datasets. We then present a detailed comparative analysis of the feature representations learned by the four deep learning models. In addition, we offer a few suggestions on how to tune hyperparameters when training deep neural networks. Based on the extensive experimental results, we believe that this evaluation will provide insights into, and a deeper understanding of, different deep learning algorithms for RGB-D feature extraction and fusion. © 2017 Elsevier Inc. All rights reserved.
Pages: 266 - 283
Page count: 18
Related papers
50 items in total
  • [1] Unsupervised Feature Learning for RGB-D Image Classification
    Jhuo, I-Hong
    Gao, Shenghua
    Zhuang, Liansheng
    Lee, D. T.
    Ma, Yi
    [J]. COMPUTER VISION - ACCV 2014, PT I, 2015, 9003 : 276 - 289
  • [2] Robust Multiview Feature Learning for RGB-D Image Understanding
    Zha, Zheng-Jun
    Yang, Yang
    Tang, Jinhui
    Wang, Meng
    Chua, Tat-Seng
    [J]. ACM TRANSACTIONS ON INTELLIGENT SYSTEMS AND TECHNOLOGY, 2015, 6 (02)
  • [3] An Empirical Analysis of Deep Feature Learning for RGB-D Object Recognition
    Caglayan, Ali
    Can, Ahmet Burak
    [J]. IMAGE ANALYSIS AND RECOGNITION, ICIAR 2017, 2017, 10317 : 312 - 320
  • [4] GrapesNet: Indian RGB & RGB-D vineyard image datasets for deep learning applications
    Barbole, Dhanashree K.
    Jadhav, Parul M.
    [J]. DATA IN BRIEF, 2023, 48
  • [5] Learning structured group sparse representation for RGB-D image classification
    Tu, Shuqin
    Xue, Yueju
    Liang, Yun
    Zhang, Xiao
    Lin, Huankai
    Guo, Aixia
    [J]. Journal of Information and Computational Science, 2015, 12 (11) : 4357 - 4367
  • [6] Traffic Scene Segmentation Based on RGB-D Image and Deep Learning
    Li, Linhui
    Qian, Bo
    Lian, Jing
    Zheng, Weina
    Zhou, Yafu
    [J]. IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2018, 19 (05) : 1664 - 1669
  • [7] RGB-D Scene Classification via Multi-modal Feature Learning
    Ziyun Cai
    Ling Shao
    [J]. Cognitive Computation, 2019, 11 : 825 - 840
  • [8] RGB-D Scene Classification via Multi-modal Feature Learning
    Cai, Ziyun
    Shao, Ling
    [J]. COGNITIVE COMPUTATION, 2019, 11 (06) : 825 - 840
  • [9] RGB-D Face Recognition via Deep Complementary and Common Feature Learning
    Zhang, Hao
    Han, Hu
    Cui, Jiyun
    Shan, Shiguang
    Chen, Xilin
    [J]. PROCEEDINGS 2018 13TH IEEE INTERNATIONAL CONFERENCE ON AUTOMATIC FACE & GESTURE RECOGNITION (FG 2018), 2018 : 8 - 15
  • [10] Multi-modal deep feature learning for RGB-D object detection
    Xu, Xiangyang
    Li, Yuncheng
    Wu, Gangshan
    Luo, Jiebo
    [J]. PATTERN RECOGNITION, 2017, 72 : 300 - 313