PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing

Cited by: 35
Authors
Zhou, Wujie [1]
Yang, Enquan [1]
Lei, Jingsheng [1]
Wan, Jian [1]
Yu, Lu [2]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ, Inst Informat & Commun Engn, Hangzhou 310023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cross-level information; depth enhancement; modality-specific fusion; progressive guided fusion; RGB-D indoor scene parsing; INFORMATION
DOI
10.1109/TMM.2022.3161852
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Scene parsing is a fundamental task in computer vision. Various RGB-D (color and depth) scene parsing methods based on fully convolutional networks have achieved excellent performance. However, color and depth information are different in nature, and existing methods cannot optimize the cooperation of high-level and low-level information when aggregating modal information, which introduces noise or loss of key information in the aggregated features and generates inaccurate segmentation maps. The features extracted from the depth branch are weak because of the low quality of the depth map, which results in unsatisfactory feature representation. To address these drawbacks, we propose a progressive guided fusion and depth enhancement network (PGDENet) for RGB-D indoor scene parsing. First, high-quality RGB images are used to improve depth data through a depth enhancement module, in which the depth maps are strengthened in terms of channel and spatial correlations. Then, we integrate information from the RGB and enhanced depth modalities using a progressive complementary fusion module, in which we start with high-level semantic information and move down layerwise to guide the fusion of adjacent layers while reducing hierarchy-based differences. Extensive experiments are conducted on two public indoor scene datasets, and the results show that the proposed PGDENet outperforms state-of-the-art methods in RGB-D scene parsing.
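The two steps named in the abstract (RGB-guided depth enhancement over channel and spatial correlations, then top-down progressive fusion) can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the function names `enhance_depth`, `upsample2x`, and `progressive_fusion` are hypothetical, parameter-free pooling stands in for whatever learned layers the paper uses, and all levels are assumed to share the same channel count with each deeper level halving the spatial size.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def enhance_depth(rgb_feat, depth_feat):
    """Hypothetical depth enhancement for (C, H, W) feature maps.

    Depth features are reweighted by channel and spatial attention
    derived from the RGB features (assumption: pooling, no learned weights).
    """
    # Channel attention: one weight per channel from global average pooling.
    channel_w = sigmoid(rgb_feat.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    # Spatial attention: one weight per pixel from the channel-wise mean.
    spatial_w = sigmoid(rgb_feat.mean(axis=0))[None, :, :]          # (1, H, W)
    # Residual form keeps the original depth signal.
    return depth_feat + depth_feat * channel_w * spatial_w

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def progressive_fusion(rgb_feats, depth_feats):
    """Hypothetical top-down fusion; lists are ordered shallow -> deep."""
    guide = None
    for rgb, depth in zip(reversed(rgb_feats), reversed(depth_feats)):
        fused = rgb + enhance_depth(rgb, depth)
        if guide is not None:
            # The deeper (more semantic) level gates the adjacent shallower one.
            fused = fused + fused * sigmoid(upsample2x(guide))
        guide = fused
    return guide

rng = np.random.default_rng(0)
shapes = [(4, 8, 8), (4, 4, 4), (4, 2, 2)]
rgb_feats = [rng.standard_normal(s) for s in shapes]
depth_feats = [rng.standard_normal(s) for s in shapes]
out = progressive_fusion(rgb_feats, depth_feats)
print(out.shape)
```

The output has the spatial size of the shallowest level, since each step passes the fused result down one level as the guide for the next.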
Pages: 3483-3494
Page count: 12
Related papers
50 in total
  • [1] FRNet: Feature Reconstruction Network for RGB-D Indoor Scene Parsing
    Zhou, Wujie
    Yang, Enquan
    Lei, Jingsheng
    Yu, Lu
    [J]. IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2022, 16 (04) : 677 - 687
  • [2] FASFLNet: feature adaptive selection and fusion lightweight network for RGB-D indoor scene parsing
    Qian, Xiaohong
    Lin, Xingyang
    Yu, Lu
    Zhou, Wujie
    [J]. OPTICS EXPRESS, 2023, 31 (05) : 8029 - 8041
  • [3] CMPFFNet: Cross-Modal and Progressive Feature Fusion Network for RGB-D Indoor Scene Semantic Segmentation
    Zhou, Wujie
    Xiao, Yuxiang
    Yan, Weiqing
    Yu, Lu
    [J]. IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING, 2023, : 1 - 11
  • [5] CCFNet: Cross-Complementary fusion network for RGB-D scene parsing of clothing images
    Xu, Gao
    Zhou, Wujie
    Qian, Xiaohong
    Ye, Lv
    Lei, Jingsheng
    Yu, Lu
    [J]. JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION, 2023, 90
  • [6] DGPINet-KD: Deep Guided and Progressive Integration Network With Knowledge Distillation for RGB-D Indoor Scene Analysis
    Zhou, Wujie
    Jian, Bitao
    Fang, Meixin
    Dong, Xiena
    Liu, Yuanyuan
    Jiang, Qiuping
    [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (09) : 7844 - 7855
  • [7] DEPTH ENHANCEMENT USING RGB-D GUIDED FILTERING
    Hui, Tak-Wai
    Ngan, King Ngi
    [J]. 2014 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2014, : 3832 - 3836
  • [8] RFNet: Reverse Fusion Network With Attention Mechanism for RGB-D Indoor Scene Understanding
    Zhou, Wujie
    Lv, Sijia
    Lei, Jingsheng
    Luo, Ting
    Yu, Lu
    [J]. IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE, 2023, 7 (02) : 598 - 603
  • [9] ResFusion: deeply fused scene parsing network for RGB-D images
    Dai, Juting
    Tang, Xinyi
    [J]. IET COMPUTER VISION, 2018, 12 (08) : 1171 - 1178
  • [10] DMFNet: Deep Multi-Modal Fusion Network for RGB-D Indoor Scene Segmentation
    Yuan, Jianzhong
    Zhou, Wujie
    Luo, Ting
    [J]. IEEE ACCESS, 2019, 7 : 169350 - 169358