FIMKD: Feature-Implicit Mapping Knowledge Distillation for RGB-D Indoor Scene Semantic Segmentation

Cited by: 0
Authors
Zhejiang University of Science & Technology, School of Information & Electronic Engineering, Hangzhou 310023, China [1]
Unknown, 308232, Singapore [2]
Unknown, 430074, China [3]
Unknown, 315211, China [4]
Institutions
Source
IEEE Trans. Artif. Intell. | 2024 / No. 12 / pp. 6488-6499
Funding
National Natural Science Foundation of China
Keywords
Deep learning - Image coding - Image enhancement - Inference engines - Mapping - Metadata - Network coding - Personnel training - RGB color model - Steganography - Students - Teaching;
DOI
10.1109/TAI.2024.3452052
Chinese Library Classification (CLC) number
Subject classification number
Abstract
Depth images are often used to improve the geometric understanding of scenes owing to their intuitive distance properties. Although there have been significant advancements in semantic segmentation tasks using red-green-blue-depth (RGB-D) images, the complexity of existing methods remains high. Furthermore, the requirement for high-quality depth images increases the model inference time, which limits the practicality of these methods. To address this issue, we propose a feature-implicit mapping knowledge distillation (FIMKD) method and a cross-modal knowledge distillation (KD) architecture to leverage deep modal information for training and reduce the model dependence on this information during inference. The approach comprises two networks: FIMKD-T, a teacher network that uses RGB-D data, and FIMKD-S, a student network that uses only RGB data. FIMKD-T extracts high-frequency information using the depth modality and compensates for the loss of RGB details due to a reduction in resolution during feature extraction by the high-frequency feature enhancement module, thereby enhancing the geometric perception of semantic features. In contrast, the FIMKD-S network does not employ deep learning techniques; instead, it uses a nonlearning approach to extract high-frequency information. To enable the FIMKD-S network to learn deep features, we propose a feature-implicit mapping KD for feature distillation. This mapping technique maps the features in channel and space to a low-dimensional hidden layer, which helps to avoid inefficient single-pattern student learning. We evaluated the proposed FIMKD-S∗ (FIMKD-S with KD) on the NYUv2 and SUN-RGBD datasets. The results demonstrate that both FIMKD-T and FIMKD-S∗ achieve state-of-the-art performance. Furthermore, FIMKD-S∗ provides the best performance balance. © 2020 IEEE.