ETFT: Equiangular Tight Frame Transformer for Imbalanced Semantic Segmentation

Authors
Jeong, Seonggyun [1]
Heo, Yong Seok [1, 2]
Affiliations
[1] Ajou Univ, Dept Artificial Intelligence, Suwon 16499, South Korea
[2] Ajou Univ, Dept Elect & Comp Engn, Suwon 16499, South Korea
Funding
National Research Foundation of Singapore;
Keywords
semantic segmentation; neural collapse; class imbalance; transformer;
DOI
10.3390/s24216913
Chinese Library Classification number
O65 [Analytical Chemistry];
Discipline classification codes
070302; 081704;
Abstract
Semantic segmentation often suffers from class imbalance, where the label ratio for each class in the dataset is not uniform. Recent studies have addressed the issue of class imbalance in semantic segmentation by leveraging the neural collapse phenomenon in conjunction with an Equiangular Tight Frame (ETF). While the use of ETF aids in enhancing the discriminability of minor classes, class correlation is another crucial factor that must be taken into account. However, managing the balance between class correlation and discrimination through neural collapse remains challenging, as these properties inherently conflict with one another. Moreover, this control is established during the training stage, resulting in a fixed classifier. There is no guarantee that this classifier will consistently perform well with different input images. To address this problem, we propose an Equiangular Tight Frame Transformer (ETFT), a transformer-based model that jointly processes the features and classifier using ETF structure, and dynamically generates the classifier as a function of the input for imbalanced semantic segmentation. Specifically, the classifier initialized with the ETF structure is jointly processed with the input patch tokens during the attention process. As a result, the transformed patch tokens, aided by the ETF structure, achieve discriminability between classes while preserving contextual correlation. The classifier, initially structured as an ETF, is adjusted to incorporate the correlation information, benefiting from the attention mechanism. Furthermore, the learned classifier is combined with the fixed ETF classifier, leveraging the advantages of both. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods for imbalanced semantic segmentation on both the ADE20K and Cityscapes datasets.
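The ETF structure named in the abstract can be illustrated with the standard simplex ETF construction from the neural collapse literature. The sketch below is not the authors' code: the function name `simplex_etf`, the feature dimension of 64, and the use of 19 classes are assumptions made only for illustration. It builds K unit-norm class vectors whose pairwise cosine similarity is exactly -1/(K-1), which is the maximal equiangular separation a fixed ETF classifier relies on.

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration; not taken from the ETFT paper.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if dim < num_classes:
        raise ValueError("this construction assumes dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Orthonormal columns U (dim x K) via reduced QR of a random Gaussian matrix.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    k = num_classes
    # Centering matrix I - (1/K) 11^T removes the mean direction.
    center = np.eye(k) - np.ones((k, k)) / k
    # The scale sqrt(K / (K - 1)) makes every column unit-norm.
    return np.sqrt(k / (k - 1)) * (u @ center)


# Illustrative sizes only (assumed, not from the source).
num_classes = 19
etf = simplex_etf(num_classes=num_classes, dim=64)
gram = etf.T @ etf  # pairwise inner products between class vectors
off_diagonal = gram[~np.eye(num_classes, dtype=bool)]
```

In this sketch, the diagonal of `gram` is 1 and every off-diagonal entry equals -1/(K-1), so no pair of classes is closer than any other. That uniform separation is what helps minor classes, and it is also why a purely fixed ETF cannot encode class correlation on its own.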
Pages: 21