SSformer: A Lightweight Transformer for Semantic Segmentation

Cited by: 11

Authors:

Shi, Wentao [1]

Xu, Jing [1]

Gao, Pan [1]

Affiliations:

[1] Nanjing Univ Aeronaut & Astronaut, Coll Comp Sci & Technol, Nanjing, Peoples R China

Source:

2022 IEEE 24TH INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP) | 2022

Keywords:

Image Segmentation; Transformer; Multilayer perceptron; Lightweight model;

DOI:

10.1109/MMSP55362.2022.9949177

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Subject classification codes:

081202 ; 0835 ;

Abstract:

It is widely believed that Transformers perform better than convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense prediction-based segmentation tasks. Further, simply combining Swin Transformer with existing methods would increase the model size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus obtaining both local and global attention. Experimental results show that the proposed SSformer yields mIoU performance comparable to state-of-the-art models while maintaining a smaller model size and lower compute. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
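The abstract reports results in terms of mIoU (mean intersection over union). For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of how it is commonly computed over flattened label maps. The function name `mean_iou`, the toy labels, and the choice to skip classes absent from both prediction and ground truth are illustrative assumptions; this record does not describe the exact evaluation protocol used by the authors.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length (one label per pixel). This is an illustrative sketch, not the
    evaluation code from the paper.
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes that never occur (an assumption)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with 6 pixels and 3 classes (hypothetical data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, 3)
print(score)
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5. Benchmark toolkits typically accumulate intersections and unions over the whole dataset before dividing, rather than averaging per image.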


Pages: 5

50 entries in total

[41] Lightweight Semantic Segmentation of Road Scenes for Autonomous Driving
Li, Shunxin
Wu, Tong
Computer Engineering and Applications, 2023, 59 (19) : 177 - 183
[42] A Lightweight and Efficient Infrared Pedestrian Semantic Segmentation Method
Liu, Shangdong
Mei, Chaojun
You, Shuai
Yao, Xiaoliang
Wu, Fei
Ji, Yimu
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2023, E106D (09) : 1564 - 1571
[43] Lidar Mapping Optimization Based on Lightweight Semantic Segmentation
Zhao, Zhihao
Zhang, Wenquan
Gu, Jianfeng
Yang, Junjie
Huang, Kai
IEEE TRANSACTIONS ON INTELLIGENT VEHICLES, 2019, 4 (03) : 353 - 362
[44] Lightweight semantic segmentation network for autonomous driving scenarios
Liu B.
Cai H.
Yang S.
Li H.
Wang Y.
Chen X.
Xi'an Dianzi Keji Daxue Xuebao/Journal of Xidian University, 2023, 50 (01) : 118 - 128
[45] TransRSS: Transformer-based Radar Semantic Segmentation
Zou, Hao
Xie, Zhen
Ou, Jiarong
Gao, Yutao
2023 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION, ICRA, 2023 : 6965 - 6972
[46] Laformer: Vision Transformer for Panoramic Image Semantic Segmentation
Yuan, Zheng
Wang, Junhua
Lv, Yuxin
Wang, Ding
Fang, Yi
IEEE SIGNAL PROCESSING LETTERS, 2023, 30 : 1792 - 1796
[47] A reversible transformer for LiDAR point cloud semantic segmentation
Akwensi, Perpertual Hope
Wang, Ruisheng
2023 20TH CONFERENCE ON ROBOTS AND VISION, CRV, 2023 : 19 - 28
[48] Video Semantic Segmentation via Sparse Temporal Transformer
Li, Jiangtong
Wang, Wentao
Chen, Junjie
Niu, Li
Si, Jianlou
Qian, Chen
Zhang, Liqing
PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021 : 59 - 68
[49] Class-Prompting Transformer for Incremental Semantic Segmentation
Song, Zichen
Shi, Zhaofeng
Shang, Chao
Meng, Fanman
Xu, Linfeng
IEEE ACCESS, 2023, 11 : 100154 - 100164
[50] TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation
Zhang, Wenqiang
Huang, Zilong
Luo, Guozhong
Chen, Tao
Wang, Xinggang
Liu, Wenyu
Yu, Gang
Shen, Chunhua
2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022 : 12073 - 12083
