Token Contrast for Weakly-Supervised Semantic Segmentation

Cited by: 29
Authors
Ru, Lixiang [1, 2, 3]
Zheng, Hehang [3]
Zhan, Yibing [3]
Du, Bo [1, 2]
Affiliations
[1] Wuhan Univ, Sch Comp Sci, Inst Artificial Intelligence, Natl Engn Res Ctr Multimedia Software, Wuhan, Peoples R China
[2] Wuhan Univ, Hubei Key Lab Multimedia & Network Commun Engn, Wuhan, Peoples R China
[3] JD Explore Acad, Beijing, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/CVPR52729.2023.00302
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Weakly-Supervised Semantic Segmentation (WSSS) using image-level labels typically relies on Class Activation Maps (CAM) to generate pseudo labels. Limited by the local structure perception of CNNs, CAM usually cannot identify the integral object regions. Although the recent Vision Transformer (ViT) can remedy this flaw, we observe that it also introduces an over-smoothing issue, i.e., the final patch tokens tend to become uniform. In this work, we propose Token Contrast (ToCo) to address this issue and further explore the virtue of ViT for WSSS. First, motivated by the observation that intermediate layers in ViT still retain semantic diversity, we design a Patch Token Contrast module (PTC). PTC supervises the final patch tokens with pseudo token relations derived from intermediate layers, allowing them to align with the semantic regions and thus yield more accurate CAM. Second, to further differentiate the low-confidence regions in CAM, we devise a Class Token Contrast module (CTC), inspired by the fact that class tokens in ViT can capture high-level semantics. CTC encourages representation consistency between uncertain local regions and global objects by contrasting their class tokens. Experiments on the PASCAL VOC and MS COCO datasets show that the proposed ToCo significantly outperforms other single-stage competitors and achieves performance comparable to state-of-the-art multi-stage methods. Code is available at https://github.com/rulixiang/ToCo.
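The PTC idea described in the abstract (deriving pseudo token relations from an intermediate layer and using them to supervise the final patch tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the similarity threshold of 0.5, and the simplified loss form are all assumptions made for illustration, and the actual code is in the repository linked above.

```python
import math


def cosine(u, v):
    """Cosine similarity between two token vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def pseudo_relations(mid_tokens, threshold=0.5):
    """Label each token pair 1 (same region) or 0 (different region).

    Labels come from intermediate-layer similarity; the threshold is an
    assumed value, not one taken from the paper.
    """
    n = len(mid_tokens)
    return [
        [1 if cosine(mid_tokens[i], mid_tokens[j]) > threshold else 0
         for j in range(n)]
        for i in range(n)
    ]


def token_contrast_loss(final_tokens, relations):
    """Simplified (hypothetical) contrast loss on final-layer tokens.

    Positive pairs are penalized for low similarity, negative pairs for
    high similarity, so uniformly similar tokens receive a large loss.
    """
    pos, neg = [], []
    n = len(final_tokens)
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(final_tokens[i], final_tokens[j])
            if relations[i][j]:
                pos.append(1.0 - sim)
            else:
                neg.append(max(sim, 0.0))
    pos_term = sum(pos) / len(pos) if pos else 0.0
    neg_term = sum(neg) / len(neg) if neg else 0.0
    return pos_term + neg_term


# Toy example: four patch tokens forming two semantic groups.
mid = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
relations = pseudo_relations(mid)

over_smoothed = [[1.0, 1.0], [1.0, 1.01], [1.01, 1.0], [1.0, 1.0]]
print(token_contrast_loss(over_smoothed, relations))  # larger loss
print(token_contrast_loss(mid, relations))            # smaller loss
```

In this toy setting, near-uniform final tokens are penalized more than tokens that preserve the two groups, which is the behavior the abstract attributes to PTC.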
Pages: 3093-3102
Page count: 10