Visual and Textual Prior Guided Mask Assemble for Few-Shot Segmentation and Beyond

Cited by: 0
Authors
Chen, Shuai [1]
Meng, Fanman [1]
Zhang, Runtong [1]
Qiu, Heqian [1]
Li, Hongliang [1]
Wu, Qingbo [1]
Xu, Linfeng [1]
Affiliations
[1] Univ Elect Sci & Technol China, Sch Informat & Commun Engn, Chengdu 611731, Peoples R China
Funding
National Key Research and Development Program of China
Keywords
Task analysis; Visualization; Image segmentation; Annotations; Prototypes; Adaptation models; Training; Few-shot segmentation; zero-shot; any-shot; class-agnostic; CLIP; AGGREGATION;
DOI
10.1109/TMM.2024.3361181
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Few-shot segmentation (FSS) aims to segment a novel class given only a few annotated images. Because CLIP aligns visual and textual information, integrating it can improve the generalization ability of FSS models. However, existing CLIP-based FSS methods still produce predictions biased towards the base classes, a bias caused by class-specific feature-level interactions. To solve this issue, we propose a visual and textual Prior Guided Mask Assemble Network (PGMA-Net). It employs a class-agnostic mask assembly process to alleviate the bias, and formulates diverse tasks in a unified manner by assembling the prior through affinity. Specifically, the class-relevant textual and visual features are first transformed into a class-agnostic prior in the form of a probability map. Then, a Prior-Guided Mask Assemble Module (PGMAM) comprising multiple General Assemble Units (GAUs) is introduced. It considers diverse and plug-and-play interactions, such as visual-textual, inter- and intra-image, training-free, and high-order ones. Lastly, to ensure the class-agnostic ability, a Hierarchical Decoder with Channel-Drop Mechanism (HDCDM) is proposed to flexibly exploit the assembled masks and low-level features without relying on any class-specific information. PGMA-Net achieves new state-of-the-art results on the FSS task, with mIoU of 77.6 on PASCAL-5^i and 59.4 on COCO-20^i in the 1-shot scenario. Beyond this, we show that, without extra re-training, the proposed PGMA-Net can solve bbox-level and cross-domain FSS, co-segmentation, and zero-shot segmentation (ZSS) tasks, leading to an any-shot segmentation framework capable of accommodating diverse weak or pixel-level annotations.
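The two ideas the abstract leans on, a class-agnostic prior expressed as a probability map and the assembly of that prior through affinity, can be illustrated with a minimal NumPy sketch. This is not the PGMA-Net implementation: the function names (`textual_prior`, `assemble_mask`), the softmax temperature, and the random stand-in features are all assumptions made for illustration, whereas the paper's model works with CLIP features and its own General Assemble Units.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def textual_prior(pixel_feats, text_embeds, temperature=0.1):
    """Hypothetical prior: probability that each pixel matches prompt 0.

    pixel_feats: (N, C) per-pixel visual features.
    text_embeds: (K, C) text embeddings; row 0 is the target prompt,
                 the remaining rows are contrast prompts (e.g. background).
    Returns an (N,) probability map that no longer carries class features.
    """
    sim = l2_normalize(pixel_feats) @ l2_normalize(text_embeds).T  # (N, K) cosine similarity
    return softmax(sim / temperature, axis=1)[:, 0]

def assemble_mask(prior_src, feats_src, feats_dst, temperature=0.1):
    """Hypothetical assembly step: propagate a prior through pixel affinity.

    Each destination pixel receives a convex combination of the source
    prior values, weighted by its row-normalized affinity to source pixels.
    """
    affinity = l2_normalize(feats_dst) @ l2_normalize(feats_src).T  # (N_dst, N_src)
    weights = softmax(affinity / temperature, axis=1)
    return weights @ prior_src  # (N_dst,)

# Stand-in data: 16 pixels with 8-dim features and 2 text prompts (assumed shapes).
rng = np.random.default_rng(0)
feats = rng.normal(size=(16, 8))
text = rng.normal(size=(2, 8))
prior = textual_prior(feats, text)
mask = assemble_mask(prior, feats, feats)  # intra-image assembly
print(prior.shape, mask.shape)
```

Passing features from a support image as `feats_src` and features from a query image as `feats_dst` would give an inter-image variant of the same step. Because only probability maps and affinities are passed on, nothing class-specific reaches later stages in this sketch.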
Pages: 7197-7209
Number of pages: 13
Related papers
50 in total
  • [1] A Prior-mask-guided Few-shot Learning for Skin Lesion Segmentation
    Xiao, Junsheng
    Xu, Huahu
    Zhao, Wei
    Cheng, Chen
    Gao, HongHao
    COMPUTING, 2023, 105 (03) : 717 - 739
  • [2] A Prior-mask-guided Few-shot Learning for Skin Lesion Segmentation
    Junsheng Xiao
    Huahu Xu
    Wei Zhao
    Chen Cheng
    HongHao Gao
    Computing, 2023, 105 : 717 - 739
  • [3] RPMG-FSS: Robust Prior Mask Guided Few-Shot Semantic Segmentation
    Zhang, Lingling
    Zhang, Xinyu
    Wang, Qianying
    Wu, Wenjun
    Chang, Xiaojun
    Liu, Jun
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2023, 33 (11) : 6609 - 6621
  • [4] Prior Guided Feature Enrichment Network for Few-Shot Segmentation
    Tian, Zhuotao
    Zhao, Hengshuang
    Shu, Michelle
    Yang, Zhicheng
    Li, Ruiyu
    Jia, Jiaya
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2022, 44 (02) : 1050 - 1065
  • [5] Hierarchical bidirectional aggregation with prior guided transformer for few-shot segmentation
    Qiuyu Kong
    Jie Jiang
    Junyan Yang
    Qi Wang
    International Journal of Multimedia Information Retrieval, 2023, 12
  • [6] Hierarchical bidirectional aggregation with prior guided transformer for few-shot segmentation
    Kong, Qiuyu
    Jiang, Jie
    Yang, Junyan
    Wang, Qi
    INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, 2023, 12 (02)
  • [7] Mask Matching Transformer for Few-Shot Segmentation
    Jiao, Siyu
    Zhang, Gengwei
    Navasardyan, Shant
    Chen, Ling
    Zhao, Yao
    Wei, Yunchao
    Shi, Humphrey
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [8] MASK-GUIDED ATTENTION AND EPISODE ADAPTIVE WEIGHTS FOR FEW-SHOT SEGMENTATION
    Kwon, Hyeongjun
    Song, Taeyong
    Kim, Sunok
    Sohn, Kwanghoon
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 2611 - 2615
  • [9] Prototype-Guided Prior Enhancement and Rectification in Few-shot Semantic Segmentation
    Tang, Yiming
    Yu, Yi
    Chen, Yan Qiu
    2024 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO, ICME 2024, 2024,
  • [10] Few-Shot Semantic Segmentation via Mask Aggregation
    Ao, Wei
    Zheng, Shunyi
    Meng, Yan
    Yang, Yang
    NEURAL PROCESSING LETTERS, 2024, 56 (02)