Performance evaluation of deep feature learning for RGB-D image/video classification

Cited by: 60
Authors
Shao, Ling [1 ,2 ]
Cai, Ziyun [3 ]
Liu, Li [2 ]
Lu, Ke [4 ,5 ]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
[3] Univ Sheffield, Dept Elect & Elect Engn, Mappin St, Sheffield S1 3JD, S Yorkshire, England
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[5] Beijing Ctr Math & Informat Interdisciplinary Sci, Beijing, Peoples R China
Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
Deep neural networks; RGB-D data; Feature learning; Performance evaluation; RECOGNITION;
DOI
10.1016/j.ins.2017.01.013
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract
Deep Neural Networks for image/video classification have achieved considerable success in various computer vision applications. Existing deep learning algorithms are widely used on RGB images or video data. Meanwhile, with the development of low-cost RGB-D sensors (such as Microsoft Kinect and Xtion Pro-Live), high-quality RGB-D data can be easily acquired and used to enhance computer vision algorithms [14]. It would be interesting to investigate how deep learning can be employed for extracting and fusing features from RGB-D data. In this paper, after briefly reviewing the basic concepts of RGB-D information and four prevalent deep learning models (i.e., Deep Belief Networks (DBNs), Stacked Denoising Auto-Encoders (SDAE), Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) Neural Networks), we conduct extensive experiments on five popular RGB-D datasets, including three image datasets and two video datasets. We then present a detailed comparative analysis of the feature representations learned by the four deep learning models. In addition, we offer a few suggestions on how to adjust hyperparameters when training deep neural networks. Based on the extensive experimental results, we believe that this evaluation will provide insights and a deeper understanding of different deep learning algorithms for RGB-D feature extraction and fusion. © 2017 Elsevier Inc. All rights reserved.
Pages: 266 - 283
Number of pages: 18
Related papers
50 in total
  • [41] Object Classification Using Dictionary Learning and RGB-D Covariance Descriptors
    Beksi, William J.
    Papanikolopoulos, Nikolaos
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA), 2015, : 1880 - 1885
  • [42] RGB-D Video based Hand Authentication using Deep Neural Networks
    Miyazaki, Ryogo
    Sasaki, Kazuya
    Tsumura, Norimichi
    Hirai, Keita
    [J]. JOURNAL OF IMAGING SCIENCE AND TECHNOLOGY, 2023, 67 (03)
  • [43] Learning for classification of traffic-related object on RGB-D data
    Yingjie Xia
    Xingmin Shi
    Na Zhao
    [J]. Multimedia Systems, 2017, 23 : 129 - 138
  • [44] Learning for classification of traffic-related object on RGB-D data
    Xia, Yingjie
    Shi, Xingmin
    Zhao, Na
    [J]. MULTIMEDIA SYSTEMS, 2017, 23 (01) : 129 - 138
  • [45] Robust RGB-D Hand Tracking Using Deep Learning Priors
    Sanchez-Riera, Jordi
    Srinivasan, Kathiravan
    Hua, Kai-Lung
    Cheng, Wen-Huang
    Hossain, M. Anwar
    Alhamid, Mohammed F.
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2018, 28 (09) : 2289 - 2301
  • [46] DF2Net: A Discriminative Feature Learning and Fusion Network for RGB-D Indoor Scene Classification
    Li, Yabei
    Zhang, Junge
    Cheng, Yanhua
    Huang, Kaiqi
    Tan, Tieniu
    [J]. THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE, 2018, : 7041 - 7048
  • [47] Deep Learning for Automated Occlusion Edge Detection in RGB-D Frames
    Sarkar, Soumik
    Venugopalan, Vivek
    Reddy, Kishore
    Ryde, Julian
    Jaitly, Navdeep
    Giering, Michael
    [J]. JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2017, 88 (02) : 205 - 217
  • [48] Region Merging Driven by Deep Learning for RGB-D Segmentation and Labeling
    Michieli, Umberto
    Camporese, Maria
    Agiollo, Andrea
    Pagnutti, Giampaolo
    Zanuttigh, Pietro
    [J]. ICDSC 2019: 13TH INTERNATIONAL CONFERENCE ON DISTRIBUTED SMART CAMERAS, 2019,
  • [49] Learning an Intrinsic Image Decomposer Using Synthesized RGB-D Dataset
    Han, Guangyun
    Xie, Xiaohua
    Lai, Jianhuang
    Zheng, Wei-Shi
    [J]. IEEE SIGNAL PROCESSING LETTERS, 2018, 25 (06) : 753 - 757
  • [50] A brief survey on RGB-D semantic segmentation using deep learning*
    Wang, Changshuo
    Wang, Chen
    Li, Weijun
    Wang, Haining
    [J]. DISPLAYS, 2021, 70