ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation

Cited by: 58
Authors
Cao, Jinming [1 ]
Leng, Hanchao [1 ]
Lischinski, Dani [2 ]
Cohen-Or, Danny [3 ]
Tu, Changhe [1 ]
Li, Yangyan [4 ]
Affiliations
[1] Shandong Univ, Jinan, Peoples R China
[2] Hebrew Univ Jerusalem, Jerusalem, Israel
[3] Tel Aviv Univ, Tel Aviv, Israel
[4] Alibaba Grp, Hangzhou, Peoples R China
Funding
U.S. National Science Foundation;
Keywords
DOI
10.1109/ICCV48922.2021.00700
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
RGB-D semantic segmentation has attracted increasing attention over the past few years. Existing methods mostly employ homogeneous convolution operators to consume the RGB and depth features, ignoring their intrinsic differences. In fact, the RGB values capture photometric appearance properties in the projected image space, while the depth feature encodes both the shape of a local geometry and its base (whereabouts) within a larger context. Compared with the base, the shape is likely more inherent and more strongly connected to the semantics, and is thus more critical for segmentation accuracy. Inspired by this observation, we introduce a Shape-aware Convolutional layer (ShapeConv) for processing the depth feature: the depth feature is first decomposed into a shape component and a base component, two learnable weights are then introduced to re-weight the two components independently, and finally a convolution is applied to the re-weighted combination. ShapeConv is model-agnostic and can be easily integrated into most CNNs to replace vanilla convolutional layers for semantic segmentation. Extensive experiments on three challenging indoor RGB-D semantic segmentation benchmarks, i.e., NYU-Dv2 (-13, -40), SUN RGB-D, and SID, demonstrate the effectiveness of ShapeConv when it is employed in five popular architectures. Moreover, CNNs with ShapeConv gain this performance boost without any increase in computation or memory at inference time: the learnt weights that balance the shape and base components become constants at inference and can therefore be fused into the following convolution, yielding a network identical to one with vanilla convolutional layers.
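The abstract describes ShapeConv only at a high level, so the following is a minimal PyTorch-style sketch of the idea rather than the authors' implementation. The decomposition used here (base component = per-channel patch mean, shape component = the residual) and the names ShapeConv2d, w_base, and w_shape are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ShapeConv2d(nn.Module):
    # Sketch of a shape-aware convolution: each k x k patch is split into a
    # base component (here, the per-channel patch mean) and a shape component
    # (the residual), the two are re-weighted by learnable scalars, and a
    # vanilla convolution kernel is applied to the recombined patches.
    def __init__(self, in_ch, out_ch, kernel_size=3, stride=1, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size,
                              stride=stride, padding=padding, bias=False)
        self.k, self.stride, self.padding = kernel_size, stride, padding
        self.w_base = nn.Parameter(torch.ones(1))   # weight of the base component
        self.w_shape = nn.Parameter(torch.ones(1))  # weight of the shape component

    def forward(self, x):
        B, C, H, W = x.shape
        # Extract sliding patches: (B, C * k * k, L), L = number of patch locations.
        patches = F.unfold(x, self.k, stride=self.stride, padding=self.padding)
        L = patches.shape[-1]
        patches = patches.view(B, C, self.k * self.k, L)
        base = patches.mean(dim=2, keepdim=True)    # per-patch mean ("whereabouts")
        shape = patches - base                      # residual ("local shape")
        mixed = self.w_base * base + self.w_shape * shape
        # Apply the vanilla kernel to the re-weighted patches.
        weight = self.conv.weight.view(self.conv.out_channels, -1)          # (O, C*k*k)
        out = torch.matmul(weight, mixed.view(B, C * self.k * self.k, L))   # (B, O, L)
        H_out = (H + 2 * self.padding - self.k) // self.stride + 1
        W_out = (W + 2 * self.padding - self.k) // self.stride + 1
        return out.view(B, self.conv.out_channels, H_out, W_out)

# Hypothetical usage: drop-in replacement for a 3x3 convolution on a 4-channel RGB-D tensor.
layer = ShapeConv2d(in_ch=4, out_ch=64)
y = layer(torch.randn(2, 4, 60, 80))   # -> shape (2, 64, 60, 80)

Because the re-weighting is linear in the patch, once w_base and w_shape are frozen after training they can, under this assumed decomposition, be folded into the kernel itself (each spatial kernel weight becomes w_shape * K_j + (w_base - w_shape) * mean(K), with the mean taken over the kernel's spatial positions per channel), which is consistent with the abstract's claim that inference costs no more than a vanilla convolution.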
Pages: 7068-7077
Number of pages: 10