Referring Image Segmentation Using Text Supervision

Authors
Liu, Fang [1, 2]
Liu, Yuhao [2 ]
Kong, Yuqiu [1 ]
Xu, Ke [2 ]
Zhang, Lihe [1 ]
Yin, Baocai [1 ]
Hancke, Gerhard [2 ]
Lau, Rynson [2 ]
Affiliations
[1] Dalian Univ Technol, Dalian, Liaoning, Peoples R China
[2] City Univ Hong Kong, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China
DOI
10.1109/ICCV51070.2023.02022
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Existing Referring Image Segmentation (RIS) methods typically require expensive pixel-level or box-level annotations for supervision. In this paper, we observe that the referring texts used in RIS already provide sufficient information to localize the target object. Hence, we propose a novel weakly-supervised RIS framework that formulates the target localization problem as a classification process to differentiate between positive and negative text expressions. While the referring text expressions for an image are used as positive expressions, the referring text expressions from other images can be used as negative expressions for this image. Our framework has three main novelties. First, we propose a bilateral prompt method to facilitate the classification process, by harmonizing the domain discrepancy between visual and linguistic features. Second, we propose a calibration method to reduce noisy background information and improve the correctness of the response maps for target object localization. Third, we propose a positive response map selection strategy to generate high-quality pseudo-labels from the enhanced response maps, for training a segmentation network for RIS inference. For evaluation, we propose a new metric to measure localization accuracy. Experiments on four benchmarks show that our framework achieves promising performance compared to existing fully-supervised RIS methods while outperforming state-of-the-art weakly-supervised methods adapted from related areas. Code is available at https://github.com/fawnliu/TRIS.
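
The core idea in the abstract, treating localization as classification between positive and negative text expressions, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). It assumes per-pixel visual features and text embeddings already live in a shared space, uses cosine similarity as the response map, global max pooling as the image-level score, and binary cross-entropy as the loss. The function names, the `scale` parameter, and the toy data are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-8)

def response_map(pixel_feats, text_emb):
    """Similarity of every pixel feature (H x W x D nested lists) to one text embedding."""
    return [[cosine(px, text_emb) for px in row] for row in pixel_feats]

def classification_loss(pixel_feats, pos_texts, neg_texts, scale=10.0):
    """Binary cross-entropy over texts: label 1 for expressions that refer to
    this image, label 0 for expressions borrowed from other images.
    Returns the mean loss and one response map per text."""
    texts = [(t, 1.0) for t in pos_texts] + [(t, 0.0) for t in neg_texts]
    maps, total = [], 0.0
    for emb, label in texts:
        rmap = response_map(pixel_feats, emb)
        # Assumed pooling: the strongest pixel response is the image-level logit.
        logit = scale * max(max(row) for row in rmap)
        prob = 1.0 / (1.0 + math.exp(-logit))
        prob = min(max(prob, 1e-7), 1.0 - 1e-7)  # clamp for numerical safety
        total -= label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob)
        maps.append(rmap)
    return total / len(texts), maps

# Toy 2 x 2 "image": three background pixels and one target pixel.
background, target = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
pixel_feats = [[background, background], [background, target]]
pos_texts = [[0.0, 1.0, 0.0]]  # expression describing this image's target
neg_texts = [[0.0, 0.0, 1.0]]  # expression taken from another image
loss, maps = classification_loss(pixel_feats, pos_texts, neg_texts)
```

In this toy setup the positive expression's response map peaks at the target pixel, which is the kind of localization cue that the abstract says is later refined into pseudo-labels. The bilateral prompt, calibration, and response map selection steps are not modeled here.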
Pages: 22067-22077
Page count: 11