Performance evaluation of deep feature learning for RGB-D image/video classification

Cited by: 63
Authors
Shao, Ling [1, 2]
Cai, Ziyun [3]
Liu, Li [2]
Lu, Ke [4, 5]
Affiliations
[1] Nanjing Univ Informat Sci & Technol, Coll Elect & Informat Engn, Nanjing 210044, Jiangsu, Peoples R China
[2] Univ East Anglia, Sch Comp Sci, Norwich NR4 7TJ, Norfolk, England
[3] Univ Sheffield, Dept Elect & Elect Engn, Mappin St, Sheffield S1 3JD, S Yorkshire, England
[4] Univ Chinese Acad Sci, Beijing 100049, Peoples R China
[5] Beijing Ctr Math & Informat Interdisciplinary Sci, Beijing, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Deep neural networks; RGB-D data; Feature learning; Performance evaluation; RECOGNITION;
DOI
10.1016/j.ins.2017.01.013
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Deep Neural Networks for image/video classification have achieved considerable success in various computer vision applications. Existing deep learning algorithms are widely used on RGB images or video data. Meanwhile, with the development of low-cost RGB-D sensors (such as Microsoft Kinect and Xtion Pro-Live), high-quality RGB-D data can be easily acquired and used to enhance computer vision algorithms [14]. It would be interesting to investigate how deep learning can be employed for extracting and fusing features from RGB-D data. In this paper, after briefly reviewing the basic concepts of RGB-D information and four prevalent deep learning models (i.e., Deep Belief Networks (DBNs), Stacked Denoising Auto-Encoders (SDAE), Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) Neural Networks), we conduct extensive experiments on five popular RGB-D datasets, including three image datasets and two video datasets. We then present a detailed analysis comparing the feature representations learned by the four deep learning models. In addition, we offer a few suggestions on how to tune hyperparameters when training deep neural networks. Based on the extensive experimental results, we believe that this evaluation will provide insights and a deeper understanding of different deep learning algorithms for RGB-D feature extraction and fusion. © 2017 Elsevier Inc. All rights reserved.
Pages: 266-283
Page count: 18
Related papers
50 in total
  • [31] Subset based deep learning for RGB-D object recognition
    Bai, Jing
    Wu, Yan
    Zhang, Junming
    Chen, Fuqiang
    NEUROCOMPUTING, 2015, 165 : 280 - 292
  • [32] Robot Skill Learning based on Interacting with RGB-D Image
    Liu, Dong
    Lu, Binpeng
    Cong, Ming
    2019 9TH IEEE ANNUAL INTERNATIONAL CONFERENCE ON CYBER TECHNOLOGY IN AUTOMATION, CONTROL, AND INTELLIGENT SYSTEMS (IEEE-CYBER 2019), 2019 : 1209 - 1214
  • [33] Bidirectional feature learning network for RGB-D salient object detection
    Niu, Ye
    Zhou, Sanping
    Dong, Yonghao
    Wang, Le
    Wang, Jinjun
    Zheng, Nanning
    PATTERN RECOGNITION, 2024, 150
  • [34] Image Classification Using PSO-SVM and an RGB-D Sensor
    Lopez-Franco, Carlos
    Villavicencio, Luis
    Arana-Daniel, Nancy
    Alanis, Alma Y.
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2014, 2014
  • [35] Unsupervised Joint Feature Learning and Encoding for RGB-D Scene Labeling
    Wang, Anran
    Lu, Jiwen
    Cai, Jianfei
    Wang, Gang
    Cham, Tat-Jen
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2015, 24 (11) : 4459 - 4473
  • [36] An RGB-D Descriptor for Object Classification
    Arican, Erkut
    Aydin, Tarkan
    ROMANIAN JOURNAL OF INFORMATION SCIENCE AND TECHNOLOGY, 2022, 25 (3-4) : 338 - 349
  • [37] Multiclass Fruit Classification of RGB-D Images Using Color and Texture Feature
    Rachmawati, Ema
    Supriana, Iping
    Khodra, Masayu Leylia
    INTELLIGENCE IN THE ERA OF BIG DATA, ICSIIT 2015, 2015, 516 : 257 - 268
  • [38] Semantic RGB-D Image Synthesis
    Li, Shijie
    Li, Rong
    Gall, Juergen
    2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023 : 944 - 952
  • [39] GENDER RECOGNITION ON RGB-D IMAGE
    Zhang, Xiaoxiong
    Javed, Sajid
    Obeid, Ahmad
    Dias, Jorge
    Werghi, Naoufel
    2020 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2020 : 1836 - 1840
  • [40] RGB-D Heterogeneous Image Feature Fusion for YOLOfuse Apple Detection Model
    Liu, Liqun
    Hao, Pengfei
    AGRONOMY-BASEL, 2023, 13 (12)