SSformer: A Lightweight Transformer for Semantic Segmentation

Citations: 11
Authors
Shi, Wentao [1 ]
Xu, Jing [1 ]
Gao, Pan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model;
DOI
10.1109/MMSP55362.2022.9949177
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Classification Code
081202; 0835;
Abstract
It is widely believed that Transformers outperform convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and incurs high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense prediction-based segmentation tasks. Further, simply combining Swin Transformer with existing methods would substantially increase the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus capturing both local and global attention. Experimental results show the proposed SSformer yields mIoU performance comparable to state-of-the-art models, while maintaining a smaller model size and lower computational cost. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
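The abstract describes a decoder that aggregates features from the different stages of a hierarchical backbone. A common realization of this idea is to project each stage's feature map to a shared channel dimension with a per-pixel linear (MLP) layer, upsample everything to the finest resolution, concatenate, and fuse. The NumPy sketch below illustrates that general scheme under stated assumptions; it is not the authors' exact SSformer decoder, and all shapes, function names, and the nearest-neighbor upsampling choice are illustrative.

```python
import numpy as np

def mlp_project(x, w):
    # Per-pixel linear projection: (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)
    return x @ w

def upsample_nearest(x, factor):
    # Nearest-neighbor upsampling: (H, W, C) -> (H*factor, W*factor, C)
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def aggregate(features, proj_weights, fuse_w):
    """Project each stage to a common channel dim, upsample to the finest
    stage's resolution, concatenate along channels, and fuse with one more
    linear layer to produce per-pixel class logits."""
    target_h = features[0].shape[0]  # finest resolution
    projected = []
    for f, w in zip(features, proj_weights):
        p = mlp_project(f, w)
        projected.append(upsample_nearest(p, target_h // p.shape[0]))
    fused = np.concatenate(projected, axis=-1)
    return mlp_project(fused, fuse_w)

# Toy hierarchical features at relative strides 1x, 2x, 4x (hypothetical sizes)
rng = np.random.default_rng(0)
feats = [rng.standard_normal((16 // s, 16 // s, c))
         for s, c in [(1, 32), (2, 64), (4, 128)]]
proj_ws = [rng.standard_normal((c, 64)) for c in (32, 64, 128)]
fuse_w = rng.standard_normal((3 * 64, 19))  # e.g. 19 classes (Cityscapes)

out = aggregate(feats, proj_ws, fuse_w)
print(out.shape)  # (16, 16, 19): per-pixel class logits at the finest scale
```

Because the only learned components are per-pixel linear layers, a decoder of this style adds few parameters, which is consistent with the paper's lightweight design goal.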
Pages: 5
Related Papers
50 results in total
  • [1] A lightweight siamese transformer for few-shot semantic segmentation
    Zhu, Hegui
    Zhou, Yange
    Jiang, Cong
    Yang, Lianping
    Jiang, Wuming
    Wang, Zhimu
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (13): 7455 - 7469
  • [2] Head-Free Lightweight Semantic Segmentation with Linear Transformer
    Dong, Bo
    Wang, Pichao
    Wang, Fan
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 516 - 524
  • [4] LSTFormer: Lightweight Semantic Segmentation Network Based on Swin Transformer
    Yang, Cheng
    Gao, Jianlin
    Zheng, Meilin
    Ding, Rong
    Computer Engineering and Applications, 2023, 59 (12): 166 - 175
  • [5] Lightweight Real-Time Semantic Segmentation Network With Efficient Transformer and CNN
    Xu, Guoan
    Li, Juncheng
    Gao, Guangwei
    Lu, Huimin
    Yang, Jian
    Yue, Dong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (12): 15897 - 15906
  • [6] CLIP for Lightweight Semantic Segmentation
    Jin, Ke
    Yang, Wankou
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X, 2024, 14434: 323 - 333
  • [7] ELiFormer: A hierarchical Transformer based Model with Efficient Encoder and Lightweight Decoder for Semantic Segmentation
    Wu, Zixuan
    Zhou, Yue
    2024 2ND ASIA CONFERENCE ON COMPUTER VISION, IMAGE PROCESSING AND PATTERN RECOGNITION, CVIPPR 2024, 2024
  • [8] TrSeg: Transformer for semantic segmentation
    Jin, Youngsaeng
    Han, David
    Ko, Hanseok
    PATTERN RECOGNITION LETTERS, 2021, 148: 29 - 35
  • [9] Segmenter: Transformer for Semantic Segmentation
    Strudel, Robin
    Garcia, Ricardo
    Laptev, Ivan
    Schmid, Cordelia
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 7242 - 7252
  • [10] Lightweight convolutional neural networks with context broadcast transformer for real-time semantic segmentation
    Hu, Kaidi
    Xie, Zongxia
    Hu, Qinghua
    IMAGE AND VISION COMPUTING, 2024, 146