Multi-scale fusion for RGB-D indoor semantic segmentation

Cited: 0
Authors
Shiyi Jiang
Yang Xu
Danyang Li
Runze Fan
Affiliations
[1] Guizhou University,College of Big Data and Information Engineering
[2] Guiyang Aluminum-magnesium Design and Research Institute Co., Ltd.
Abstract
In computer vision, convolution and pooling operations tend to discard high-frequency information, and contour details gradually disappear as the network deepens; this is especially harmful in image semantic segmentation. In RGB-D semantic segmentation, the complementary information in the RGB and depth images is not fully exploited, whereas the wavelet transform preserves both the low- and high-frequency content of the original image. To address this information loss, we propose an RGB-D indoor semantic segmentation network based on multi-scale fusion: a wavelet transform fusion module to retain contour details, a nonsubsampled contourlet transform to replace the pooling operation, and a multiple pyramid module to aggregate multi-scale information and global context. The proposed method retains multi-scale characteristics with the help of the wavelet transform and makes full use of the complementarity of high- and low-frequency information. Because the convolutional network can deepen without losing its multi-frequency characteristics, segmentation accuracy on edge and contour details also improves. We evaluated the proposed method on the commonly used indoor datasets NYUv2 and SUNRGB-D; the results show state-of-the-art performance with real-time inference.
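To illustrate why a wavelet-style downsampling step is lossless where pooling is not, here is a minimal pure-Python sketch (not the paper's implementation, and a plain Haar transform rather than the nonsubsampled contourlet transform the paper uses): a one-level 2D Haar decomposition splits an image into a half-resolution low-frequency subband LL, roughly equivalent to 2x2 average pooling, plus three high-frequency detail subbands, and the four subbands together reconstruct the input exactly.

```python
def haar_dwt2(img):
    """One-level 2D Haar transform of a list-of-lists image with even
    height and width. Returns four half-resolution subbands: LL (the
    low-frequency approximation, equivalent to 2x2 average pooling)
    and LH/HL/HH (the high-frequency detail that pooling discards)."""
    h, w = len(img), len(img[0])
    zeros = lambda: [[0.0] * (w // 2) for _ in range(h // 2)]
    LL, LH, HL, HH = zeros(), zeros(), zeros(), zeros()
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            LL[i // 2][j // 2] = (a + b + c + d) / 4  # average (low freq.)
            LH[i // 2][j // 2] = (a + b - c - d) / 4  # vertical detail
            HL[i // 2][j // 2] = (a - b + c - d) / 4  # horizontal detail
            HH[i // 2][j // 2] = (a - b - c + d) / 4  # diagonal detail
    return LL, LH, HL, HH

def ihaar_dwt2(LL, LH, HL, HH):
    """Inverse transform: the four subbands recover the input exactly,
    so no information is lost at the downsampling step."""
    h, w = 2 * len(LL), 2 * len(LL[0])
    img = [[0.0] * w for _ in range(h)]
    for i in range(len(LL)):
        for j in range(len(LL[0])):
            ll, lh, hl, hh = LL[i][j], LH[i][j], HL[i][j], HH[i][j]
            img[2 * i][2 * j] = ll + lh + hl + hh
            img[2 * i][2 * j + 1] = ll + lh - hl - hh
            img[2 * i + 1][2 * j] = ll - lh + hl - hh
            img[2 * i + 1][2 * j + 1] = ll - lh - hl + hh
    return img
```

Average pooling keeps only the LL band, so the round trip is lossy; keeping all four subbands makes downsampling invertible, which is the property the wavelet fusion module exploits to preserve contour detail at depth.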
Related papers (50 in total)
  • [1] Multi-scale fusion for RGB-D indoor semantic segmentation
    Jiang, Shiyi
    Xu, Yang
    Li, Danyang
    Fan, Runze
    [J]. SCIENTIFIC REPORTS, 2022, 12 (01)
  • [2] RDFNet: RGB-D Multi-level Residual Feature Fusion for Indoor Semantic Segmentation
    Park, Seong-Jin
    Hong, Ki-Sang
    Lee, Seungyong
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 4990 - 4999
  • [3] RAFNet: RGB-D attention feature fusion network for indoor semantic segmentation
    Yan, Xingchao
    Hou, Sujuan
    Karim, Awudu
    Jia, Weikuan
    [J]. DISPLAYS, 2021, 70
  • [4] SEMANTICS-GUIDED MULTI-LEVEL RGB-D FEATURE FUSION FOR INDOOR SEMANTIC SEGMENTATION
    Li, Yabei
    Zhang, Junge
    Cheng, Yanhua
    Huang, Kaiqi
    Tan, Tieniu
    [J]. 2017 24TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2017, : 1262 - 1266
  • [5] EFDCNet: Encoding fusion and decoding correction network for RGB-D indoor semantic segmentation
    Chen, Jianlin
    Li, Gongyang
    Zhang, Zhijiang
    Zeng, Dan
    [J]. IMAGE AND VISION COMPUTING, 2024, 142
  • [6] Temporally Consistent Semantic Segmentation using Spatially Aware Multi-view Semantic Fusion for Indoor RGB-D videos
    Sun, Fengyuan
    Karaoglu, Sezer
    Gevers, Theo
    [J]. 2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW, 2023, : 4250 - 4259
  • [7] Accurate semantic segmentation of RGB-D images for indoor navigation
    Sharan, Sudeep
    Nauth, Peter
    Dominguez-Jimenez, Juan-Jose
    [J]. JOURNAL OF ELECTRONIC IMAGING, 2022, 31 (06)
  • [8] Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis
    Seichter, Daniel
    Koehler, Mona
    Lewandowski, Benjamin
    Wengefeld, Tim
    Gross, Horst-Michael
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2021), 2021, : 13525 - 13531
  • [9] Locality-Sensitive Deconvolution Networks with Gated Fusion for RGB-D Indoor Semantic Segmentation
    Cheng, Yanhua
    Cai, Rui
    Li, Zhiwei
    Zhao, Xin
    Huang, Kaiqi
    [J]. 30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, : 1475 - 1483
  • [10] RGB×D: Learning depth-weighted RGB patches for RGB-D indoor semantic segmentation
    Cao, Jinming
    Leng, Hanchao
    Cohen-Or, Daniel
    Lischinski, Dani
    Chen, Ying
    Tu, Changhe
    Li, Yangyan
    [J]. NEUROCOMPUTING, 2021, 462 : 568 - 580