DGPINet-KD: Deep Guided and Progressive Integration Network with Knowledge Distillation for RGB-D Indoor Scene Analysis

Cited by: 1
Authors
Zhou W. [1 ]
Jian B. [1 ]
Fang M. [1 ]
Dong X. [1 ]
Liu Y. [4 ]
Jiang Q. [5 ]
Affiliations
[1] Zhejiang University of Science and Technology, Hangzhou
[2] School of Computer Science and Engineering, Nanyang Technological University, Singapore
[3] School of Information Science and Engineering, Ningbo University, Ningbo
Keywords
branch attention; circuits and systems; computational modeling; convolution; depth guidance; feature extraction; indoor scene analysis; knowledge distillation; logic gates; RGB-D data; semantic segmentation; semantics
DOI
10.1109/TCSVT.2024.3382354
Abstract
Significant advances in RGB-D semantic segmentation have been made owing to the increasing availability of robust depth information. Most researchers combine depth with RGB data to capture complementary information in images. Although this approach improves segmentation performance, it requires excessive model parameters. To address this problem, we propose DGPINet-KD, a deep-guided and progressive integration network with knowledge distillation (KD) for RGB-D indoor scene analysis. First, we use branching attention and depth guidance to capture coordinated, precise location information and to extract more complete spatial information from the depth map, complementing the semantic information of the encoded features. Second, we train the student network (DGPINet-S) under a well-trained teacher network (DGPINet-T) using multilevel KD. Third, an integration unit is developed to explore the contextual dependencies of the decoding features and to enhance relational KD. Comprehensive experiments on two challenging indoor benchmark datasets, NYUDv2 and SUN RGB-D, demonstrate that DGPINet-KD outperforms existing methods on indoor scene analysis tasks. Notably, on NYUDv2, DGPINet-KD (DGPINet-S with KD) achieves a pixel-accuracy gain of 1.7% and a class-accuracy gain of 2.3% over DGPINet-S. In addition, compared with DGPINet-T, DGPINet-KD uses significantly fewer parameters (29.3M) while maintaining accuracy. The source code is available at https://github.com/XUEXIKUAIL/DGPINet.
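The abstract describes a multilevel KD scheme (feature-level transfer at several encoder/decoder stages plus response-level transfer from DGPINet-T to DGPINet-S) without implementation details; the authors' exact losses, including the relational KD term, are in the linked repository. As a rough illustration only, here is a minimal PyTorch sketch of a generic multilevel distillation loss. The function name `multilevel_kd_loss`, the weighting `alpha`, the temperature value, and the toy shapes are assumptions, not taken from the paper.

```python
import torch
import torch.nn.functional as F


def multilevel_kd_loss(student_feats, teacher_feats,
                       student_logits, teacher_logits,
                       temperature=4.0, alpha=0.5):
    """Combine feature-level and response-level distillation terms.

    Assumes each student feature map has already been projected to the
    shape of its teacher counterpart (e.g., via 1x1 convolutions).
    """
    # Feature-level KD: L2 matching at each of the selected stages.
    feat_loss = sum(
        F.mse_loss(s, t.detach())
        for s, t in zip(student_feats, teacher_feats)
    )
    # Response-level KD: KL divergence between temperature-softened
    # per-pixel class distributions (logits are N x C x H x W).
    s_logp = F.log_softmax(student_logits / temperature, dim=1)
    t_prob = F.softmax(teacher_logits.detach() / temperature, dim=1)
    resp_loss = F.kl_div(s_logp, t_prob,
                         reduction="batchmean") * temperature ** 2
    return alpha * feat_loss + (1.0 - alpha) * resp_loss


# Hypothetical shapes: 2 images, 40 classes (as in NYUDv2), two stages.
s_feats = [torch.randn(2, 64, 60, 80), torch.randn(2, 128, 30, 40)]
t_feats = [torch.randn(2, 64, 60, 80), torch.randn(2, 128, 30, 40)]
s_logits = torch.randn(2, 40, 120, 160)
t_logits = torch.randn(2, 40, 120, 160)
loss = multilevel_kd_loss(s_feats, t_feats, s_logits, t_logits)
```

Detaching the teacher tensors keeps gradients from flowing into the frozen teacher, and the standard `temperature ** 2` factor keeps the soft-label gradient magnitude comparable across temperatures.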