ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation

Cited by: 58
Authors
Cao, Jinming [1 ]
Leng, Hanchao [1 ]
Lischinski, Dani [2 ]
Cohen-Or, Danny [3 ]
Tu, Changhe [1 ]
Li, Yangyan [4 ]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Hebrew Univ Jerusalem, Jerusalem, Israel
[3] Tel Aviv Univ, Tel Aviv, Israel
[4] Alibaba Grp, Hangzhou, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
DOI
10.1109/ICCV48922.2021.00700
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGB-D semantic segmentation has attracted increasing attention over the past few years. Existing methods mostly employ homogeneous convolution operators to consume the RGB and depth features, ignoring their intrinsic differences. In fact, the RGB values capture photometric appearance properties in the projected image space, while the depth feature encodes both the shape of a local geometry and its base (whereabouts) within a larger context. Compared with the base, the shape is likely more inherent and more strongly connected to the semantics, and is thus more critical for segmentation accuracy. Inspired by this observation, we introduce a Shape-aware Convolutional layer (ShapeConv) for processing the depth feature: the depth feature is first decomposed into a shape component and a base component, two learnable weights are then introduced to re-weight the two components independently, and finally a convolution is applied to the re-weighted combination. ShapeConv is model-agnostic and can be easily integrated into most CNNs to replace vanilla convolutional layers for semantic segmentation. Extensive experiments on three challenging indoor RGB-D semantic segmentation benchmarks, i.e., NYU-Dv2 (-13, -40), SUN RGB-D, and SID, demonstrate the effectiveness of ShapeConv when it is employed in five popular architectures. Moreover, CNNs with ShapeConv gain this performance boost without any increase in computation or memory at inference time: the learnt weights that balance the shape and base components become constants at inference and can therefore be fused into the following convolution, yielding a network identical to one with vanilla convolutional layers.
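The abstract describes ShapeConv only at a high level, so the following is a minimal PyTorch-style sketch of the idea rather than the authors' implementation. The decomposition used here (base component = per-channel patch mean, shape component = the residual) and the names ShapeConv2d, w_base, and w_shape are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapeConv2d(nn.Module):
    # Sketch of a shape-aware convolution: each k x k patch is split into a
    # base component (here, the per-channel patch mean) and a shape component
    # (the residual), the two are re-weighted by learnable scalars, and a
    # vanilla convolution kernel is applied to the recombined patches.
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              stride=stride, padding=padding, bias=False)
        self.k, self.stride, self.padding = kernel_size, stride, padding
        self.w_base = nn.Parameter(torch.ones(1))   # weight of the base component
        self.w_shape = nn.Parameter(torch.ones(1))  # weight of the shape component

    def forward(self, x):
        B, C, H, W = x.shape
        # Extract sliding patches: (B, C * k * k, L), L = number of patch locations.
        patches = F.unfold(x, self.k, stride=self.stride, padding=self.padding)
        L = patches.shape[-1]
        patches = patches.view(B, C, self.k * self.k, L)
        base = patches.mean(dim=2, keepdim=True)    # per-patch mean ("whereabouts")
        shape = patches - base                      # residual ("local shape")
        mixed = self.w_base * base + self.w_shape * shape
        # Apply the vanilla kernel to the re-weighted patches.
        weight = self.conv.weight.view(self.conv.out_channels, -1)          # (O, C*k*k)
        out = torch.matmul(weight, mixed.view(B, C * self.k * self.k, L))   # (B, O, L)
        H_out = (H + 2 * self.padding - self.k) // self.stride + 1
        W_out = (W + 2 * self.padding - self.k) // self.stride + 1
        return out.view(B, self.conv.out_channels, H_out, W_out)

# Hypothetical usage: drop-in replacement for a 3x3 convolution on a 4-channel RGB-D tensor.
layer = ShapeConv2d(in_ch=4, out_ch=64)
y = layer(torch.randn(2, 4, 60, 80))   # -> shape (2, 64, 60, 80)

Because the re-weighting is linear in the patch, once w_base and w_shape are frozen after training they can, under this assumed decomposition, be folded into the kernel itself (each spatial kernel weight becomes w_shape * K_j + (w_base - w_shape) * mean(K), with the mean taken over the kernel's spatial positions per channel), which is consistent with the abstract's claim that inference costs no more than a vanilla convolution.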
Pages: 7068-7077
Number of pages: 10