3D Neighborhood Convolution: Learning Depth-Aware Features for RGB-D and RGB Semantic Segmentation

Cited by: 11
Authors
Chen, Yunlu [1]
Mensink, Thomas [2]
Gavves, Efstratios [1]
Affiliations
[1] Univ Amsterdam, Amsterdam, Netherlands
[2] Google Res, Amsterdam, Netherlands
DOI
10.1109/3DV.2019.00028
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A key challenge in RGB-D segmentation is how to effectively incorporate 3D geometric information from the depth channel into 2D appearance features. We propose to model the effective receptive field of a 2D convolution based on the scale and locality of the 3D neighborhood. Standard convolutions are local in image space (u, v), often with a fixed receptive field of 3x3 pixels. We instead define convolutions that are local with respect to the corresponding point in real-world 3D space (x, y, z): the depth channel is used to adapt the receptive field of the convolution, which makes the resulting filters invariant to scale and focused on a certain range of depth. We introduce 3D Neighborhood Convolution (3DN-Conv), a convolutional operator around 3D neighborhoods. Further, by substituting estimated depth, we can apply our RGB-D semantic segmentation model to RGB-only input. Experimental results validate that the proposed 3DN-Conv operator improves semantic segmentation, using either ground-truth depth (RGB-D) or estimated depth (RGB).
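The idea of a receptive field that follows a 3D neighborhood can be illustrated with a minimal NumPy sketch. This is not the authors' 3DN-Conv operator: it replaces learned filter weights with a plain average, and it assumes a pinhole camera, under which a metric length r at depth z spans roughly f * r / z pixels. The names `window_half_width`, `neighborhood_3d_average`, `radius_m`, and `max_half` are hypothetical and chosen only for illustration.

```python
import numpy as np


def window_half_width(z, focal, radius_m, max_half=8):
    """Pixel half-width of a window that covers `radius_m` metres at depth `z`.

    Pinhole assumption: a length r at depth z projects to about focal * r / z pixels.
    The result is clamped to [1, max_half] to keep the window bounded.
    """
    return int(np.clip(np.rint(focal * radius_m / z), 1, max_half))


def neighborhood_3d_average(feat, depth, focal, radius_m, max_half=8):
    """Average a feature map over depth-adapted windows (illustrative only).

    feat:     (H, W) feature map
    depth:    (H, W) depth in metres, assumed strictly positive
    focal:    focal length in pixels
    radius_m: assumed metric radius of the 3D neighborhood
    """
    H, W = feat.shape
    out = np.zeros_like(feat, dtype=float)
    for v in range(H):
        for u in range(W):
            z = depth[v, u]
            # Nearby surfaces get a large pixel window, distant ones a small one.
            half = window_half_width(z, focal, radius_m, max_half)
            v0, v1 = max(v - half, 0), min(v + half + 1, H)
            u0, u1 = max(u - half, 0), min(u + half + 1, W)
            patch = feat[v0:v1, u0:u1]
            dpatch = depth[v0:v1, u0:u1]
            # Keep only neighbours within the same depth range as the centre pixel.
            # The centre itself always passes, so the mask is never empty.
            mask = np.abs(dpatch - z) <= radius_m
            out[v, u] = patch[mask].mean()
    return out
```

With a fixed metric radius, the pixel window shrinks as depth grows, which is one way to read the scale invariance described in the abstract; the depth mask stops features from mixing across a depth discontinuity, for example between a foreground object and the wall behind it.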
Pages: 173-182
Number of pages: 10