EFDCNet: Encoding fusion and decoding correction network for RGB-D indoor semantic segmentation

Cited by: 1
Authors
Chen, Jianlin [1, 2]
Li, Gongyang [1, 2]
Zhang, Zhijiang [1, 2]
Zeng, Dan [1, 2]
Affiliations
[1] Shanghai Univ, Shanghai Inst Adv Commun & Data Sci, Shanghai 200444, Peoples R China
[2] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
Keywords
RGB-D indoor semantic segmentation; Encoding fusion; Decoding correction
DOI
10.1016/j.imavis.2023.104892
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Semantic segmentation is a crucial task in vision measurement systems that involves understanding and segmenting different objects and regions within an image. Over the years, numerous RGB-D semantic segmentation methods have been developed, leveraging the encoder-decoder architecture to achieve outstanding performance. However, existing methods have two main problems that constrain further performance improvement. First, in the encoding stage, existing methods have a weak ability to fuse cross-modal information, and low-quality depth maps can easily lead to poor feature representation. Second, in the decoding stage, the upsampling of high-level semantic information may cause the loss of contextual information, and low-level features from the encoder may bring noise to the decoder through skip connections. To solve these issues, we propose a novel Encoding Fusion and Decoding Correction Network (EFDCNet) for RGB-D indoor semantic segmentation. First, in the encoding stage of EFDCNet, we focus on extracting valuable information from low-quality depth maps, and employ a channel-wise filter to select informative depth features. Additionally, we establish the global dependencies between RGB and depth features via the self-attention mechanism to enhance the cross-modal feature interactions, extracting discriminative and powerful features. Then, in the decoding stage of EFDCNet, we use the highest-level information as semantic guidance to compensate for the information lost during upsampling and filter out noise from the low-level encoder features propagated through the skip connections to the decoder. Extensive experiments conducted on two widely used RGB-D indoor semantic segmentation datasets demonstrate that the proposed EFDCNet surpasses the performance of relevant state-of-the-art methods. The code is available at https://github.com/Mark9010/EFDCNet
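The two encoding-stage ideas in the abstract (a channel-wise filter on depth features, followed by self-attention across RGB and depth features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; their code is in the repository linked above. The function names, the squeeze-and-excitation-style gate used as the channel-wise filter, the random weights `w1` and `w2`, and the residual fusion are all assumptions made for illustration.

```python
import numpy as np


def channel_filter(depth_feat, w1, w2):
    """Hypothetical channel-wise filter (squeeze-and-excitation style).

    depth_feat: (C, H, W) depth feature map.
    w1: (C_hidden, C) and w2: (C, C_hidden) stand in for learned weights.
    Returns the reweighted features and the per-channel gate in (0, 1).
    """
    squeezed = depth_feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid score per channel
    return depth_feat * gate[:, None, None], gate


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def cross_modal_attention(rgb_feat, depth_feat):
    """Hypothetical cross-modal attention: every RGB position attends to
    every depth position, so the fusion captures global dependencies.

    Both inputs are (C, H, W). Returns fused (C, H, W) features and the
    (H*W, H*W) attention matrix.
    """
    c, h, w = rgb_feat.shape
    q = rgb_feat.reshape(c, h * w).T               # queries from RGB: (N, C)
    k = depth_feat.reshape(c, h * w).T             # keys from depth:  (N, C)
    v = k                                          # values from depth
    attn = softmax(q @ k.T / np.sqrt(c), axis=-1)  # (N, N), rows sum to 1
    fused = q + attn @ v                           # residual fusion (assumed)
    return fused.T.reshape(c, h, w), attn


# Toy usage with random features; shapes only, no trained weights.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
rgb = rng.standard_normal((C, H, W))
depth = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // 2, C))
w2 = rng.standard_normal((C, C // 2))

filtered_depth, gate = channel_filter(depth, w1, w2)
fused, attn = cross_modal_attention(rgb, filtered_depth)
print(fused.shape, attn.shape)
```

In this sketch the gate suppresses depth channels before fusion, which is one plausible reading of "select informative depth features"; the paper may define its filter and attention differently.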
Pages: 11