SSformer: A Lightweight Transformer for Semantic Segmentation

Citations: 11
Authors
Shi, Wentao [1 ]
Xu, Jing [1 ]
Gao, Pan [1 ]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China
Keywords
Image Segmentation; Transformer; Multilayer perceptron; Lightweight model;
DOI
10.1109/MMSP55362.2022.9949177
Chinese Library Classification (CLC)
TP31 [Computer Software];
Discipline Classification Code
081202; 0835;
Abstract
It is widely believed that Transformers outperform convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and incurs high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense prediction-based segmentation tasks. Further, simply combining Swin Transformer with existing methods would substantially increase the size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus capturing both local and global attention. Experimental results show the proposed SSformer yields mIoU performance comparable to state-of-the-art models, while maintaining a smaller model size and lower computational cost. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
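The abstract describes a decoder that aggregates features from the different stages of a hierarchical backbone. A common realization of this idea is to project each stage's feature map to a shared channel dimension with a per-pixel linear (MLP) layer, upsample everything to the finest resolution, concatenate, and fuse. The NumPy sketch below illustrates that general scheme under stated assumptions; it is not the authors' exact SSformer decoder, and all shapes, function names, and the nearest-neighbor upsampling choice are illustrative.

```python
import numpy as np

def mlp_project(x, w):
    # Per-pixel linear projection: (H, W, C_in) @ (C_in, C_out) -> (H, W, C_out)
    return x @ w

def upsample_nearest(x, factor):
    # Nearest-neighbor upsampling: (H, W, C) -> (H*factor, W*factor, C)
    return x.repeat(factor, axis=0).repeat(factor, axis=1)

def aggregate(features, proj_weights, fuse_w):
    """Project each stage to a common channel dim, upsample to the finest
    stage's resolution, concatenate along channels, and fuse with one more
    linear layer to produce per-pixel class logits."""
    target_h = features[0].shape[0]  # finest resolution
    projected = []
    for f, w in zip(features, proj_weights):
        p = mlp_project(f, w)
        projected.append(upsample_nearest(p, target_h // p.shape[0]))
    fused = np.concatenate(projected, axis=-1)
    return mlp_project(fused, fuse_w)

# Toy hierarchical features at relative strides 1x, 2x, 4x (hypothetical sizes)
rng = np.random.default_rng(0)
feats = [rng.standard_normal((16 // s, 16 // s, c))
         for s, c in [(1, 32), (2, 64), (4, 128)]]
proj_ws = [rng.standard_normal((c, 64)) for c in (32, 64, 128)]
fuse_w = rng.standard_normal((3 * 64, 19))  # e.g. 19 classes (Cityscapes)

out = aggregate(feats, proj_ws, fuse_w)
print(out.shape)  # (16, 16, 19): per-pixel class logits at the finest scale
```

Because the only learned components are per-pixel linear layers, a decoder of this style adds few parameters, which is consistent with the paper's lightweight design goal.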
Pages: 5
Related Papers
50 results in total
  • [1] A lightweight siamese transformer for few-shot semantic segmentation
    Zhu, Hegui
    Zhou, Yange
    Jiang, Cong
    Yang, Lianping
    Jiang, Wuming
    Wang, Zhimu
    NEURAL COMPUTING & APPLICATIONS, 2024, 36 (13): 7455 - 7469
  • [2] Head-Free Lightweight Semantic Segmentation with Linear Transformer
    Dong, Bo
    Wang, Pichao
    Wang, Fan
    THIRTY-SEVENTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 37 NO 1, 2023: 516 - 524
  • [4] LSTFormer: Lightweight Semantic Segmentation Network Based on Swin Transformer
    Yang, Cheng
    Gao, Jianlin
    Zheng, Meilin
    Ding, Rong
    Computer Engineering and Applications, 2023, 59 (12): 166 - 175
  • [5] Lightweight Real-Time Semantic Segmentation Network With Efficient Transformer and CNN
    Xu, Guoan
    Li, Juncheng
    Gao, Guangwei
    Lu, Huimin
    Yang, Jian
    Yue, Dong
    IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2023, 24 (12): 15897 - 15906
  • [6] CLIP for Lightweight Semantic Segmentation
    Jin, Ke
    Yang, Wankou
    PATTERN RECOGNITION AND COMPUTER VISION, PRCV 2023, PT X, 2024, 14434: 323 - 333
  • [7] ELiFormer: A hierarchical Transformer based Model with Efficient Encoder and Lightweight Decoder for Semantic Segmentation
    Wu, Zixuan
    Zhou, Yue
    2024 2ND ASIA CONFERENCE ON COMPUTER VISION, IMAGE PROCESSING AND PATTERN RECOGNITION, CVIPPR 2024, 2024
  • [8] TrSeg: Transformer for semantic segmentation
    Jin, Youngsaeng
    Han, David
    Ko, Hanseok
    PATTERN RECOGNITION LETTERS, 2021, 148: 29 - 35
  • [9] Segmenter: Transformer for Semantic Segmentation
    Strudel, Robin
    Garcia, Ricardo
    Laptev, Ivan
    Schmid, Cordelia
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021: 7242 - 7252
  • [10] Lightweight convolutional neural networks with context broadcast transformer for real-time semantic segmentation
    Hu, Kaidi
    Xie, Zongxia
    Hu, Qinghua
    IMAGE AND VISION COMPUTING, 2024, 146