FIMKD: Feature-Implicit Mapping Knowledge Distillation for RGB-D Indoor Scene Semantic Segmentation

Affiliations
[1] School of Information & Electronic Engineering, Zhejiang University of Science & Technology, Hangzhou 310023, China
[2] Unknown, 308232, Singapore
[3] Unknown, 430074, China
[4] Unknown, 315211, China
Source
IEEE Trans. Artif. Intell. | 2024, 12: 6488-6499
Funding
National Natural Science Foundation of China
Keywords
Deep learning; Image coding; Image enhancement; Inference engines; Mapping; Metadata; Network coding; Personnel training; RGB color model; Steganography; Students; Teaching
DOI
10.1109/TAI.2024.3452052
Abstract
Depth images are often used to improve the geometric understanding of scenes owing to their intuitive distance properties. Although semantic segmentation using red-green-blue-depth (RGB-D) images has advanced significantly, the complexity of existing methods remains high. Furthermore, the requirement for high-quality depth images increases the model inference time, which limits the practicality of these methods. To address these issues, we propose a feature-implicit mapping knowledge distillation (FIMKD) method and a cross-modal knowledge distillation (KD) architecture that leverage depth-modality information during training and reduce the model's dependence on this information during inference. The approach comprises two networks: FIMKD-T, a teacher network that uses RGB-D data, and FIMKD-S, a student network that uses only RGB data. FIMKD-T uses the depth modality to extract high-frequency information and, through a high-frequency feature enhancement module, compensates for the RGB details lost when resolution is reduced during feature extraction, thereby enhancing the geometric perception of semantic features. In contrast, the FIMKD-S network does not employ deep learning techniques; instead, it uses a nonlearning approach to extract high-frequency information. To enable the FIMKD-S network to learn deep features, we propose a feature-implicit mapping KD for feature distillation. This technique maps features along the channel and spatial dimensions to a low-dimensional hidden layer, which helps to avoid inefficient single-pattern student learning. We evaluated the proposed FIMKD-S∗ (FIMKD-S with KD) on the NYUv2 and SUN-RGBD datasets. The results demonstrate that both FIMKD-T and FIMKD-S∗ achieve state-of-the-art performance. Furthermore, FIMKD-S∗ provides the best performance balance. © 2020 IEEE.
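The abstract names two ideas that a small example may make more concrete: a nonlearning high-frequency extractor, and a distillation loss computed after mapping features to a low-dimensional hidden layer. The sketch below is illustrative only and is not the authors' implementation. It assumes a fixed Laplacian filter as the nonlearning extractor and plain linear projections over the flattened channel and spatial dimensions as the mapping. The function names `high_frequency` and `latent_distillation_loss`, the kernel, and all shapes are hypothetical choices made for this example.

```python
import numpy as np


def high_frequency(image: np.ndarray) -> np.ndarray:
    """Apply a fixed 3x3 Laplacian high-pass filter to a 2-D array.

    Assumption: a Laplacian stands in for the "nonlearning approach";
    the abstract does not say which operator the paper uses.
    """
    kernel = np.array([[0, -1, 0],
                       [-1, 4, -1],
                       [0, -1, 0]], dtype=np.float64)
    h, w = image.shape
    # Edge padding keeps the output the same size as the input.
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out


def latent_distillation_loss(teacher: np.ndarray, student: np.ndarray,
                             proj_t: np.ndarray, proj_s: np.ndarray) -> float:
    """Mean squared error between teacher and student features after each
    is projected to a shared K-dimensional latent vector.

    teacher: (C_t, H, W), student: (C_s, H, W)
    proj_t:  (K, C_t*H*W), proj_s: (K, C_s*H*W)
    Assumption: the projections are plain matrices here; in a real system
    they would be trained together with the student.
    """
    z_t = proj_t @ teacher.reshape(-1)  # flatten channel and space, then map
    z_s = proj_s @ student.reshape(-1)
    return float(np.mean((z_t - z_s) ** 2))


# Small demonstration with random stand-in features (hypothetical shapes).
rng = np.random.default_rng(0)
teacher_feat = rng.normal(size=(8, 4, 4))   # e.g. RGB-D teacher features
student_feat = rng.normal(size=(4, 4, 4))   # e.g. RGB-only student features
P_t = rng.normal(size=(16, teacher_feat.size))
P_s = rng.normal(size=(16, student_feat.size))
print("distillation loss:",
      latent_distillation_loss(teacher_feat, student_feat, P_t, P_s))
print("high-frequency map shape:",
      high_frequency(rng.normal(size=(6, 6))).shape)
```

Comparing features in a shared latent space, rather than element by element, lets the teacher and student have different channel counts. This is one plausible reading of why an implicit mapping could be preferable to direct feature matching.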