Adaptive Agent Transformer for Few-Shot Segmentation

Cited by: 18
Authors
Wang, Yuan [1]
Sun, Rui [1]
Zhang, Zhe [3, 4, 5]
Zhang, Tianzhu [1, 2, 5]
Affiliations
[1] Univ Sci & Technol China, Hefei, Peoples R China
[2] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
[3] Beijing Inst Technol, Beijing, Peoples R China
[4] CNSA, Lunar Explorat & Space Engn Ctr, Beijing, Peoples R China
[5] Deep Space Explorat Lab, Beijing, Peoples R China
Source
Keywords
Few-shot segmentation; Semantic segmentation; Transformer
DOI
10.1007/978-3-031-19818-2_3
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Few-shot segmentation (FSS) aims to segment objects in a given query image with only a few labelled support images. The limited support information makes it an extremely challenging task. Most previous best-performing methods adopt prototypical learning or affinity learning. Nevertheless, they either neglect to further utilize support pixels for facilitating segmentation and lose spatial information, or are not robust to noisy pixels and computationally expensive. In this work, we propose a novel end-to-end adaptive agent transformer (AAFormer) to integrate prototypical and affinity learning to exploit the complementarity between them via a transformer encoder-decoder architecture, including a representation encoder, an agent learning decoder and an agent matching decoder. The proposed AAFormer enjoys several merits. First, to learn agent tokens well without any explicit supervision, and to make agent tokens capable of dividing different objects into diverse parts in an adaptive manner, we customize the agent learning decoder according to the three characteristics of context awareness, spatial awareness and diversity. Second, the proposed agent matching decoder is responsible for decomposing the direct pixel-level matching matrix into two more computationally-friendly matrices to suppress the noisy pixels. Extensive experimental results on two standard benchmarks demonstrate that our AAFormer performs favorably against state-of-the-art FSS methods.
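The abstract says the agent matching decoder replaces the direct pixel-level matching matrix with two cheaper matrices. The sketch below illustrates that general idea only, under stated assumptions: a query-to-agent affinity and an agent-to-support affinity, each a row-wise softmax over dot products, whose product stands in for the dense query-to-support affinity. The function names (`affinity`, `agent_matching`), the dot-product similarity, and the toy features are all hypothetical and are not taken from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def affinity(x, y):
    """Row-wise softmax over dot-product similarities (assumed similarity)."""
    return [softmax(row) for row in matmul(x, transpose(y))]


def agent_matching(query, support, agents, support_mask):
    """Propagate the support mask to query pixels through K agent tokens.

    Assumed shapes: query Nq x d, support Ns x d, agents K x d, mask Ns.
    The dense Nq x Ns matrix is never formed; the cost is O((Nq + Ns) * K * d).
    """
    q2a = affinity(query, agents)    # Nq x K
    a2s = affinity(agents, support)  # K x Ns
    # Foreground evidence collected by each agent from the support pixels.
    agent_fg = [sum(w * m for w, m in zip(row, support_mask)) for row in a2s]
    # Each query pixel mixes the agents' evidence by its own agent affinity.
    return [sum(w * f for w, f in zip(row, agent_fg)) for row in q2a]


# Toy data (hypothetical): two query pixels, three support pixels, two agents.
query = [[1.0, 0.0], [0.0, 1.0]]
support = [[2.0, 0.0], [0.0, 2.0], [1.5, 0.1]]
support_mask = [1.0, 0.0, 1.0]
agents = [[1.0, 0.0], [0.0, 1.0]]

scores = agent_matching(query, support, agents, support_mask)
```

With K agents, the two factors have Nq x K and K x Ns entries instead of Nq x Ns, and their product has rank at most K, which is one plausible reading of why routing through agents could damp the influence of individual noisy pixels.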
Pages: 36-52
Page count: 17