TRACL: Temporal reconstruction and adaptive consistency loss for semi-supervised video semantic segmentation

Authors
Liang, Zhixue [1, 2]
Dong, Wenyong [1, 3, 4]
Zhang, Bo [1]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Wuhan, Peoples R China
[2] Nanyang Inst Technol, Sch Comp & Software, Nanyang, Peoples R China
[3] Xinjiang Univ Polit Sci & Law, Sch Informat Network Secur, Tumushuke, Peoples R China
[4] Wuhan Univ, Wuhan 430072, Peoples R China
Keywords
adaptive consistency loss; temporal reconstruction; video semantic segmentation;
DOI
10.1049/ipr2.12952
Chinese Library Classification number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Existing supervised semantic segmentation methods have achieved significant performance gains, but they rely heavily on large-scale pixel-level annotated data. To reduce this dependence, recent research has proposed semi-supervised learning methods that have achieved great success. However, almost all of these works address image semantic segmentation, while semi-supervised video semantic segmentation (SVSS) has barely been explored. Because video data differ significantly from images, simply adapting semi-supervised image semantic segmentation approaches to SVSS may neglect the inherent temporal correlations between video frames. This paper presents a novel method, named TRACL, that combines temporal reconstruction (TR) and adaptive consistency loss (ACL) for SVSS, aiming to fully utilize the temporal relations among the frames within a video clip. The authors' TR method performs reconstruction at both the feature and output levels to narrow the distribution gap between frames within a clip. Specifically, considering the underlying data distribution, the authors construct a Gaussian model for each category and use its probability density function to measure the similarity between different feature maps for temporal feature reconstruction. The authors' ACL adaptively selects between two pixel-wise consistency losses, Flow Consistency Loss and Reconstruction Consistency Loss, providing stronger supervision signals for unlabelled frames during model training. Additionally, the authors extend their method to unlabelled videos, to obtain more training data, by employing a mean-teacher structure. Extensive experiments on three datasets, Cityscapes, CamVid and VSPW, demonstrate that the proposed method outperforms previous state-of-the-art methods.
To fully utilize the large number of unlabelled video frames in video semantic segmentation, the authors adopt a semi-supervised learning approach. This paper presents a novel method (named TRACL) with temporal reconstruction (TR) and adaptive consistency loss (ACL) for video semantic segmentation, aiming to leverage the temporal relations of video frames. The authors' TR method performs reconstruction at the feature and output levels to narrow the distribution gap between frames within a clip, while their ACL adaptively selects between pixel-wise consistency losses, Flow Consistency Loss and Reconstruction Consistency Loss, providing stronger supervision signals for unlabelled frames during model training. The proposed method achieves state-of-the-art segmentation performance on Cityscapes and CamVid.
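The two ideas named in the abstract, per-category Gaussian models scored by their probability density and a per-pixel choice between two consistency losses, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the covariance form, the loss definitions, or the selection criterion. The sketch assumes diagonal covariances, a squared-error consistency loss, and a "keep the smaller loss per pixel" rule, and every function name in it is hypothetical.

```python
import math
from collections import defaultdict


def fit_class_gaussians(features, labels, eps=1e-6):
    """Fit one diagonal Gaussian (per-dimension mean and variance) per category.

    Assumption: diagonal covariance; the abstract does not state the form used.
    """
    grouped = defaultdict(list)
    for vec, lab in zip(features, labels):
        grouped[lab].append(vec)
    models = {}
    for lab, vecs in grouped.items():
        n, dim = len(vecs), len(vecs[0])
        mean = [sum(v[d] for v in vecs) / n for d in range(dim)]
        # eps keeps the variance strictly positive for degenerate classes.
        var = [sum((v[d] - mean[d]) ** 2 for v in vecs) / n + eps
               for d in range(dim)]
        models[lab] = (mean, var)
    return models


def log_density(vec, model):
    """Log of the Gaussian probability density of vec under one class model."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (x - m) ** 2 / s)
               for x, m, s in zip(vec, mean, var))


def class_similarity(vec, models):
    """Normalise per-class densities into similarity scores that sum to 1."""
    logs = {lab: log_density(vec, m) for lab, m in models.items()}
    top = max(logs.values())  # subtract the max for numerical stability
    exps = {lab: math.exp(v - top) for lab, v in logs.items()}
    total = sum(exps.values())
    return {lab: e / total for lab, e in exps.items()}


def squared_error(pred, target):
    """Hypothetical per-pixel consistency loss between two probability vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def adaptive_consistency_loss(pred, flow_target, recon_target):
    """Per pixel, keep whichever of the two consistency losses is smaller.

    Assumption: the "smaller loss wins" rule is illustrative only; the
    abstract says the selection is adaptive but not how it is made.
    """
    total, choices = 0.0, []
    for p, f, r in zip(pred, flow_target, recon_target):
        flow_loss, recon_loss = squared_error(p, f), squared_error(p, r)
        if flow_loss <= recon_loss:
            choices.append("flow")
            total += flow_loss
        else:
            choices.append("recon")
            total += recon_loss
    return total / len(pred), choices


if __name__ == "__main__":
    feats = [[0.0, 0.0], [0.2, -0.1], [5.0, 5.0], [5.1, 4.8]]
    labs = ["road", "road", "car", "car"]
    models = fit_class_gaussians(feats, labs)
    print(class_similarity([0.1, 0.0], models))

    pred = [[0.9, 0.1], [0.2, 0.8]]
    flow = [[1.0, 0.0], [1.0, 0.0]]
    recon = [[0.0, 1.0], [0.0, 1.0]]
    print(adaptive_consistency_loss(pred, flow, recon))
```

In a real pipeline the feature vectors would come from a segmentation backbone and the two targets from optical-flow warping and temporal reconstruction respectively; here they are toy lists so the sketch runs with the standard library alone.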
Pages: 348-361
Number of pages: 14