ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation

Authors
Jeong, Seonggyun [1]
Heo, Yong Seok [1, 2]
Affiliations
[1] Ajou Univ, Dept Artificial Intelligence, Suwon 16499, South Korea
[2] Ajou Univ, Dept Elect & Comp Engn, Suwon 16499, South Korea
Funding
National Research Foundation of Singapore
Keywords
semantic segmentation; neural collapse; class imbalance; transformer
DOI
10.3390/s24216913
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Discipline classification codes
070302; 081704
Abstract
Semantic segmentation often suffers from class imbalance, where the label ratio for each class in the dataset is not uniform. Recent studies have addressed the issue of class imbalance in semantic segmentation by leveraging the neural collapse phenomenon in conjunction with an Equiangular Tight Frame (ETF). While the use of ETF aids in enhancing the discriminability of minor classes, class correlation is another crucial factor that must be taken into account. However, managing the balance between class correlation and discrimination through neural collapse remains challenging, as these properties inherently conflict with one another. Moreover, this control is established during the training stage, resulting in a fixed classifier. There is no guarantee that this classifier will consistently perform well with different input images. To address this problem, we propose an Equiangular Tight Frame Transformer (ETFT), a transformer-based model that jointly processes the features and classifier using ETF structure, and dynamically generates the classifier as a function of the input for imbalanced semantic segmentation. Specifically, the classifier initialized with the ETF structure is jointly processed with the input patch tokens during the attention process. As a result, the transformed patch tokens, aided by the ETF structure, achieve discriminability between classes while preserving contextual correlation. The classifier, initially structured as an ETF, is adjusted to incorporate the correlation information, benefiting from the attention mechanism. Furthermore, the learned classifier is combined with the fixed ETF classifier, leveraging the advantages of both. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods for imbalanced semantic segmentation on both the ADE20K and Cityscapes datasets.
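For readers unfamiliar with the ETF structure mentioned in the abstract, the following is a minimal sketch of the standard simplex ETF construction from the neural collapse literature, M = sqrt(K/(K-1)) · U · (I_K − (1/K)·1·1ᵀ), where U has orthonormal columns. It is not the authors' implementation: the function name `simplex_etf`, the random choice of U, and the sizes (19 classes, 64 feature dimensions) are illustrative assumptions. The check at the end shows why such a fixed classifier treats all classes symmetrically: every class vector has unit norm and every pair of class vectors has the same cosine, -1/(K-1).

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration; assumes dim >= num_classes.
    """
    if dim < num_classes:
        raise ValueError("this construction assumes dim >= num_classes")
    rng = np.random.default_rng(seed)
    # U has orthonormal columns (U.T @ U = I), obtained from a reduced QR.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    k = num_classes
    # Centering matrix I - (1/K) * 1 1^T removes the mean of the class vectors.
    centering = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * u @ centering


# Illustrative sizes only (assumed, not taken from the paper).
etf = simplex_etf(num_classes=19, dim=64)

# Gram matrix: diagonal entries are squared norms, off-diagonal entries are
# pairwise inner products (equal to cosines, since the norms are 1).
gram = etf.T @ etf
```

In this sketch the diagonal of `gram` is 1 and every off-diagonal entry equals -1/(K-1), which is the maximal equal separation achievable for K unit vectors. A classifier fixed in this way cannot encode that some classes are more related than others, which is the tension between discrimination and class correlation that the abstract describes.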
Pages: 21