WeClick: Weakly-Supervised Video Semantic Segmentation with Click Annotations

Cited by: 4
Authors
Liu, Peidong [1]
He, Zibin [1]
Yan, Xiyu [1]
Jiang, Yong [1,2]
Xia, Shu-Tao [1,2]
Zheng, Feng [3]
Hu, Maowei [1,4]
Affiliations
[1] Tsinghua Univ, Tsinghua Shenzhen Int Grad Sch, Beijing, Peoples R China
[2] Peng Cheng Lab, PCL Res Ctr Networks & Commun, Beijing, Peoples R China
[3] Southern Univ Sci & Technol, Dept Comp Sci & Engn, Shenzhen, Peoples R China
[4] Shenzhen Rejoice Sport Tech Co Ltd, Shenzhen, Peoples R China
Keywords
video semantic segmentation; weakly-supervised learning; click annotations; knowledge distillation; CUT;
DOI
10.1145/3474085.3475217
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compared with tedious per-pixel mask annotation, annotating data with clicks is much easier and costs only several seconds per image. However, using clicks to train a video semantic segmentation model has not been explored before. In this work, we propose an effective weakly-supervised video semantic segmentation pipeline with click annotations, called WeClick, which saves laborious annotation effort by requiring only a single click per instance of a semantic class. Since clicks do not capture detailed semantic information, training directly with click labels leads to poor segmentation predictions. To mitigate this problem, we design a novel memory flow knowledge distillation strategy that exploits temporal information (named memory flow) in abundant unlabeled video frames by distilling the neighboring predictions to the target frame via estimated motion. Moreover, we adopt vanilla knowledge distillation for model compression. In this way, WeClick learns compact video semantic segmentation models from low-cost click annotations during training, yet the resulting models are accurate and run in real time at inference. Experimental results on Cityscapes and CamVid show that WeClick outperforms the state-of-the-art methods, improves on the baseline by 10.24% mIoU, and achieves real-time execution.
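The idea of distilling neighboring predictions to the target frame via estimated motion can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`warp_nearest`, `distillation_loss`), the nearest-neighbor backward warp, the flow convention, and the choice of a KL-divergence loss are all assumptions made only for illustration, since the abstract does not specify them.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def warp_nearest(values, flow):
    """Backward-warp a (C, H, W) map into the target frame.

    Assumed convention (hypothetical): flow has shape (2, H, W), with
    flow[0] = dx and flow[1] = dy, so target pixel (y, x) reads from
    neighbor pixel (y + dy, x + dx). Sampling is nearest-neighbor and
    coordinates are clamped to the image border.
    """
    _, h, w = values.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, h - 1)
    return values[:, src_y, src_x]


def distillation_loss(target_logits, neighbor_logits, flow, eps=1e-8):
    """Mean per-pixel KL(teacher || student).

    The teacher is the neighbor-frame prediction warped to the target
    frame; the student is the target-frame prediction.
    """
    teacher = warp_nearest(softmax(neighbor_logits), flow)
    student = softmax(target_logits)
    kl = (teacher * (np.log(teacher + eps) - np.log(student + eps))).sum(axis=0)
    return float(kl.mean())


# Usage: 3 classes on a 4x5 frame, with a uniform one-pixel horizontal motion.
rng = np.random.default_rng(0)
neighbor_logits = rng.normal(size=(3, 4, 5))
target_logits = rng.normal(size=(3, 4, 5))
flow = np.zeros((2, 4, 5))
flow[0] = 1.0
loss = distillation_loss(target_logits, neighbor_logits, flow)
```

In a real pipeline the flow would come from an optical flow estimator and the warp would typically be bilinear and differentiable; the nearest-neighbor version here only keeps the sketch short.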
Pages: 2995-3004
Page count: 10