PGDENet: Progressive Guided Fusion and Depth Enhancement Network for RGB-D Indoor Scene Parsing

Cited by: 35
Authors
Zhou, Wujie [1]
Yang, Enquan [1]
Lei, Jingsheng [1]
Wan, Jian [1]
Yu, Lu [2]
Affiliations
[1] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[2] Zhejiang Univ, Inst Informat & Commun Engn, Hangzhou 310023, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Cross-level information; depth enhancement; modality-specific fusion; progressive guided fusion; RGB-D indoor scene parsing; INFORMATION
DOI
10.1109/TMM.2022.3161852
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Scene parsing is a fundamental task in computer vision. Various RGB-D (color and depth) scene parsing methods based on fully convolutional networks have achieved excellent performance. However, color and depth information differ in nature, and existing methods cannot optimize the cooperation of high-level and low-level information when aggregating modal information. This introduces noise into the aggregated features or loses key information from them, and produces inaccurate segmentation maps. In addition, the features extracted from the depth branch are weak because of the low quality of the depth map, which results in unsatisfactory feature representation. To address these drawbacks, we propose a progressive guided fusion and depth enhancement network (PGDENet) for RGB-D indoor scene parsing. First, high-quality RGB images are used to improve depth data through a depth enhancement module, in which the depth maps are strengthened in terms of channel and spatial correlations. Then, we integrate information from the RGB and enhanced depth modalities using a progressive complementary fusion module, in which we start with high-level semantic information and move down layerwise to guide the fusion of adjacent layers while reducing hierarchy-based differences. Extensive experiments are conducted on two public indoor scene datasets, and the results show that the proposed PGDENet outperforms state-of-the-art methods in RGB-D scene parsing.
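The abstract does not give the internals of either module, so the following is only a loose NumPy illustration of the two ideas it names: RGB features reweighting depth features along channel and spatial dimensions, and a top-down pass in which each higher-level fused map guides the next lower level. Every function name, the sigmoid gating, the additive fusion, and the assumption that feature maps share a channel count and halve in size per level are hypothetical choices made for this sketch, not the authors' design.

```python
import numpy as np


def _sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def enhance_depth(rgb_feat, depth_feat):
    """Hypothetical stand-in for depth enhancement.

    Both inputs have shape (C, H, W). RGB features yield one weight per
    channel and one weight per pixel, which rescale the depth features.
    """
    channel_w = _sigmoid(rgb_feat.mean(axis=(1, 2)))[:, None, None]  # (C, 1, 1)
    spatial_w = _sigmoid(rgb_feat.mean(axis=0))[None, :, :]          # (1, H, W)
    # Residual form: keep the original depth signal, add the reweighted one.
    return depth_feat + depth_feat * channel_w * spatial_w


def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)


def progressive_fusion(rgb_feats, depth_feats):
    """Hypothetical top-down fusion.

    Lists are ordered from low level (largest map) to high level
    (smallest map). Fusion starts at the highest level and moves down,
    adding the upsampled result of the level above to each level.
    """
    fused = None
    for rgb, depth in zip(reversed(rgb_feats), reversed(depth_feats)):
        level = rgb + depth               # assumed additive cross-modal fusion
        if fused is not None:
            level = level + upsample2x(fused)  # guidance from the level above
        fused = level
    return fused


# Toy usage with random three-level feature pyramids (4 channels each).
rng = np.random.default_rng(0)
sizes = [8, 4, 2]
rgb_pyramid = [rng.standard_normal((4, s, s)) for s in sizes]
depth_pyramid = [rng.standard_normal((4, s, s)) for s in sizes]
enhanced = [enhance_depth(r, d) for r, d in zip(rgb_pyramid, depth_pyramid)]
output = progressive_fusion(rgb_pyramid, enhanced)
print(output.shape)
```

The sketch applies enhancement before fusion, matching the order stated in the abstract; a real network would use learned convolutions in place of the fixed pooling and addition used here.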
Pages: 3483-3494
Number of pages: 12