Learning Rich Features from RGB-D Images for Object Detection and Segmentation

Cited by: 957
Authors
Gupta, Saurabh [1]
Girshick, Ross [1]
Arbelaez, Pablo [2]
Malik, Jitendra [1]
Affiliations
[1] Univ Calif Berkeley, Berkeley, CA 94720 USA
[2] Univ Los Andes, Bogota, Colombia
Source
Keywords
RGB-D perception; object detection; object segmentation
DOI
10.1007/978-3-319-10584-0_23
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper we study the problem of object detection for RGB-D images using semantically rich image and depth features. We propose a new geocentric embedding for depth images that encodes height above ground and angle with gravity for each pixel in addition to the horizontal disparity. We demonstrate that this geocentric embedding works better than using raw depth images for learning feature representations with convolutional neural networks. Our final object detection system achieves an average precision of 37.3%, which is a 56% relative improvement over existing methods. We then focus on the task of instance segmentation where we label pixels belonging to object instances found by our detector. For this task, we propose a decision forest approach that classifies pixels in the detection window as foreground or background using a family of unary and binary tests that query shape and geocentric pose features. Finally, we use the output from our object detectors in an existing superpixel classification framework for semantic scene segmentation and achieve a 24% relative improvement over current state-of-the-art for the object categories that we study. We believe advances such as those represented in this paper will facilitate the use of perception in fields like robotics.
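The three channels named in the abstract (horizontal disparity, height above ground, angle with gravity) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that 3D points in camera coordinates, unit surface normals, and a unit gravity direction are already available. The function name `geocentric_embedding`, the parameter `focal_times_baseline`, and the use of the lowest scene point as a stand-in for the ground are all assumptions made for illustration.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def geocentric_embedding(points, normals, gravity, focal_times_baseline=1.0):
    """Return one (disparity, height, angle_deg) tuple per pixel.

    points:  (x, y, z) per pixel in camera coordinates, z > 0 is depth
    normals: unit surface normal per pixel (assumed precomputed)
    gravity: unit vector along gravity (assumed already estimated)
    focal_times_baseline: hypothetical scale so that disparity = scale / depth
    """
    # Coordinate of each point along the "up" axis (opposite to gravity).
    up_coords = [-dot(p, gravity) for p in points]
    # Assumption: the lowest observed point approximates the ground level.
    ground = min(up_coords)

    channels = []
    for p, n, up in zip(points, normals, up_coords):
        disparity = focal_times_baseline / p[2]
        height = up - ground
        # Clamp to guard against rounding just outside [-1, 1].
        cos_angle = max(-1.0, min(1.0, dot(n, gravity)))
        angle_deg = math.degrees(math.acos(cos_angle))
        channels.append((disparity, height, angle_deg))
    return channels


if __name__ == "__main__":
    # Toy scene: camera y axis points down, so gravity is (0, 1, 0).
    pts = [(0.0, 1.0, 2.0), (0.0, 0.25, 4.0), (0.0, 0.0, 5.0)]
    nrm = [(0.0, -1.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0)]
    for row in geocentric_embedding(pts, nrm, (0.0, 1.0, 0.0)):
        print(row)
```

In this toy scene the first two points lie on upward-facing surfaces (floor and a raised horizontal surface) and the third on a vertical wall, so the angle channel separates horizontal from vertical surfaces while the height channel separates the floor from the raised surface.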
Pages: 345-360
Number of pages: 16
Related papers
50 in total
  • [1] HAND AND OBJECT SEGMENTATION FROM RGB-D IMAGES FOR INTERACTION WITH PLANAR SURFACES
    Weber, Henrique
    Jung, Claudio Rosito
    Gelb, Dan
    2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015 : 2984 - 2988
  • [2] 3D-SSD: Learning hierarchical features from RGB-D images for amodal 3D object detection
    Luo, Qianhui
    Ma, Huifang
    Tang, Li
    Wang, Yue
    Xiong, Rong
    NEUROCOMPUTING, 2020, 378 : 364 - 374
  • [3] Learning Coupled Classifiers with RGB images for RGB-D object recognition
    Li, Xiao
    Fang, Min
    Zhang, Ju-Jie
    Wu, Jinqiao
    PATTERN RECOGNITION, 2017, 61 : 433 - 446
  • [4] Learning of perceptual grouping for object segmentation on RGB-D data
    Richtsfeld, Andreas
    Moerwald, Thomas
    Prankl, Johann
    Zillich, Michael
    Vincze, Markus
    JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2014, 25 (01) : 64 - 73
  • [5] MLBSNet: Mutual Learning and Boosting Segmentation Network for RGB-D Salient Object Detection
    Xia, Chenxing
    Wang, Jingjing
    Ge, Bing
    ELECTRONICS, 2024, 13 (14)
  • [6] Fast Graph-Based Object Segmentation for RGB-D Images
    Toscana, Giorgio
    Rosa, Stefano
    Bona, Basilio
    PROCEEDINGS OF SAI INTELLIGENT SYSTEMS CONFERENCE (INTELLISYS) 2016, VOL 2, 2018, 16 : 42 - 58
  • [7] Unsupervised Segmentation of RGB-D Images
    Deng, Zhuo
    Latecki, Longin Jan
    COMPUTER VISION - ACCV 2014, PT III, 2015, 9005 : 423 - 435
  • [8] Expandable YOLO: 3D Object Detection from RGB-D Images
    Takahashi, Masahiro
    Ji, Yonghoon
    Umeda, Kazunori
    Moro, Alessandro
    2020 21ST INTERNATIONAL CONFERENCE ON RESEARCH AND EDUCATION IN MECHATRONICS (REM), 2020
  • [9] A salient object detection algorithm based on RGB-D images
    Song, Can
    Wu, Jin
    Deng, Huiping
    Zhu, Lei
    2020 CHINESE AUTOMATION CONGRESS (CAC 2020), 2020 : 1692 - 1697
  • [10] Indoor Scene Understanding with RGB-D Images: Bottom-up Segmentation, Object Detection and Semantic Segmentation
    Gupta, Saurabh
    Arbelaez, Pablo
    Girshick, Ross
    Malik, Jitendra
    INTERNATIONAL JOURNAL OF COMPUTER VISION, 2015, 112 (02) : 133 - 149