FIMKD: Feature-Implicit Mapping Knowledge Distillation for RGB-D Indoor Scene Semantic Segmentation

Author affiliations
[1] Zhejiang University of Science & Technology, School of Information & Electronic Engineering, Hangzhou 310023, China
[2] Unknown, 308232, Singapore
[3] Unknown, 430074, China
[4] Unknown, 315211, China
Source
IEEE Trans. Artif. Intell. | 2024 / 12 / 6488-6499
Funding
National Natural Science Foundation of China
Keywords
Deep learning - Image coding - Image enhancement - Inference engines - Mapping - Metadata - Network coding - Personnel training - RGB color model - Steganography - Students - Teaching
DOI
10.1109/TAI.2024.3452052
Abstract
Depth images are often used to improve the geometric understanding of scenes owing to their intuitive distance properties. Although semantic segmentation using red-green-blue-depth (RGB-D) images has advanced significantly, the complexity of existing methods remains high. Furthermore, the requirement for high-quality depth images increases the model inference time, which limits the practicality of these methods. To address these issues, we propose a feature-implicit mapping knowledge distillation (FIMKD) method and a cross-modal knowledge distillation (KD) architecture that leverage depth-modality information during training and reduce the model's dependence on this information during inference. The approach comprises two networks: FIMKD-T, a teacher network that uses RGB-D data, and FIMKD-S, a student network that uses only RGB data. FIMKD-T extracts high-frequency information from the depth modality and, through a high-frequency feature enhancement module, compensates for the RGB detail lost when resolution is reduced during feature extraction, thereby enhancing the geometric perception of semantic features. In contrast, the FIMKD-S network does not employ deep learning techniques; instead, it uses a nonlearning approach to extract high-frequency information. To enable the FIMKD-S network to learn deep features, we propose a feature-implicit mapping KD for feature distillation. This technique maps features along the channel and spatial dimensions to a low-dimensional hidden layer, which helps to avoid inefficient single-pattern student learning. We evaluated the proposed FIMKD-S∗ (FIMKD-S with KD) on the NYUv2 and SUN-RGBD datasets. The results demonstrate that both FIMKD-T and FIMKD-S∗ achieve state-of-the-art performance. Furthermore, FIMKD-S∗ provides the best performance balance. © 2020 IEEE.
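The abstract states only that FIMKD-S extracts high-frequency information with a nonlearning approach; it does not name the operator. As a rough illustration of what a parameter-free high-frequency extractor can look like, the sketch below subtracts a 3x3 box blur from a single-channel image. The choice of filter and the function names `box_blur` and `high_frequency` are assumptions made for this example, not the method used in the paper.

```python
def box_blur(img):
    """3x3 mean filter with edge replication.

    `img` is a list of equal-length rows of floats (one channel).
    Hypothetical helper for illustration only.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates so border pixels reuse their nearest neighbour.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def high_frequency(img):
    """High-pass residual: the image minus its blurred (low-frequency) version.

    No parameters are learned; this is an assumed stand-in for a
    nonlearning high-frequency extractor.
    """
    low = box_blur(img)
    return [
        [img[y][x] - low[y][x] for x in range(len(img[0]))]
        for y in range(len(img))
    ]


if __name__ == "__main__":
    # A vertical step edge: the residual is nonzero only next to the edge.
    edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
    for row in high_frequency(edge):
        print([round(v, 3) for v in row])
```

On a constant image the residual is zero everywhere, while on the step edge above it is nonzero only in the two columns adjacent to the edge, which is the kind of detail signal a student network could receive without any depth input.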