Contrastive learning-based knowledge distillation for RGB-thermal urban scene semantic segmentation

Authors
Guo, Xiaodong [1]
Zhou, Wujie [2,3]
Liu, Tong
Affiliations
[1] Beijing Inst Technol, Sch Automation, Beijing 100081, Peoples R China
[2] Zhejiang Univ Sci & Technol, Sch Informat & Elect Engn, Hangzhou 310023, Peoples R China
[3] Nanyang Technol Univ, Sch Comp Sci & Engn, Singapore 308232, Singapore
Funding
National Natural Science Foundation of China
Keywords
RGB-Thermal semantic segmentation; Urban scene; Autonomous driving; Knowledge distillation; Contrastive learning; NETWORK;
DOI
10.1016/j.knosys.2024.111588
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
RGB-thermal semantic segmentation helps unmanned platforms perceive and characterize their surroundings, which is critical for autonomous driving tasks. Deep-learning-based algorithms have achieved dominance in terms of accuracy and robustness. However, their large parameter counts and heavy computational demands impede their deployment on terminal devices. To address this challenge, we propose a novel strategy for balancing effectiveness and compactness. It includes a robust teacher network, CLNet-T, and a streamlined student network, CLNet-S. Using knowledge distillation (KD), we obtained an optimized model called CLNet-S*. Specifically, CLNet-T and CLNet-S are identical in all aspects except for the feature extraction component. Both include a multi-attribute hierarchical feature interaction module (MHFI) and a detail-guided semantic decoder (DGSD). The MHFI first filters features according to the characteristics of the low- and high-level features, then gradually combines complementary and common features from the different modalities in distinct receptive fields. The DGSD uses edge and distribution information to guide semantic decoding, thereby improving segmentation accuracy at class boundaries. To enhance the performance of the compact student model, our KD strategy includes detail and semantic response distillation (DSRD) and contrastive learning-based feature distillation (CLFD). In practice, DSRD enables the student model to gain knowledge from the teacher model at both the detail and semantic levels, while CLFD increases the similarity of features within the same category and emphasizes the distinctiveness of features between different categories in both the student and teacher models. Extensive experiments on two standard datasets consistently demonstrate that both CLNet-T and CLNet-S* outperform other state-of-the-art methods.
The code and results are available at https://github.com/xiaodonguo/CLNet.
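
The two distillation ideas named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository above for that); it assumes generic formulations: a temperature-softened KL divergence for response distillation, and a supervised-contrastive-style loss in which student pixel features are anchors and teacher features of the same class are positives. The function names, the `temperature` and `tau` parameters, and the flattened `(num_pixels, ...)` array shapes are all hypothetical choices made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def response_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened class distributions.

    Both inputs have shape (num_pixels, num_classes). Generic KD loss,
    assumed here as a stand-in for response-level distillation.
    """
    p_t = softmax(teacher_logits / temperature)
    p_s = softmax(student_logits / temperature)
    kl = (p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12))).sum(axis=-1)
    return float(kl.mean() * temperature ** 2)

def contrastive_feature_loss(student_feats, teacher_feats, labels, tau=0.1):
    """Class-aware contrastive loss between student and teacher features.

    student_feats, teacher_feats: (num_pixels, dim); labels: (num_pixels,).
    Teacher features with the anchor's label are positives; all other
    teacher features act as negatives (an assumed, simplified formulation).
    """
    s = student_feats / (np.linalg.norm(student_feats, axis=1, keepdims=True) + 1e-12)
    t = teacher_feats / (np.linalg.norm(teacher_feats, axis=1, keepdims=True) + 1e-12)
    sim = s @ t.T / tau                                   # (N, N) scaled cosine similarity
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - m - np.log(np.exp(sim - m).sum(axis=1, keepdims=True))
    pos = (labels[:, None] == labels[None, :]).astype(float)
    # Every row has at least one positive: the teacher feature at the same pixel.
    loss = -(pos * log_prob).sum(axis=1) / pos.sum(axis=1)
    return float(loss.mean())

# Toy usage with random data: 8 pixels, 3 classes, 16-dimensional features.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=8)
kd = response_distillation_loss(rng.normal(size=(8, 3)), rng.normal(size=(8, 3)))
cl = contrastive_feature_loss(rng.normal(size=(8, 16)), rng.normal(size=(8, 16)), labels)
print(f"response loss: {kd:.4f}, contrastive loss: {cl:.4f}")
```

In a training loop, terms like these would typically be weighted and added to the student's ordinary segmentation loss; the weighting is a design choice the abstract does not specify.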
Pages: 15