SSformer: A Lightweight Transformer for Semantic Segmentation

Cited by: 11
Authors
Shi, Wentao [1]
Xu, Jing [1]
Gao, Pan [1]
Affiliations
[1] Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Nanjing, People's Republic of China
Keywords
Image segmentation; Transformer; Multilayer perceptron; Lightweight model
DOI
10.1109/MMSP55362.2022.9949177
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification codes
081202; 0835
Abstract
It is widely believed that Transformers perform better than convolutional neural networks in semantic segmentation. Nevertheless, the original Vision Transformer [2] may lack the inductive biases of local neighborhoods and has a high time complexity. Recently, Swin Transformer [3] set new records in various vision tasks by using a hierarchical architecture and shifted windows while being more efficient. However, as Swin Transformer is specifically designed for image classification, it may achieve suboptimal performance on dense-prediction tasks such as segmentation. Further, simply combining Swin Transformer with existing methods would increase the model size and parameter count of the final segmentation model. In this paper, we rethink the Swin Transformer for semantic segmentation and design a lightweight yet effective transformer model, called SSformer. In this model, considering the inherent hierarchical design of Swin Transformer, we propose a decoder that aggregates information from different layers, thus obtaining both local and global attention. Experimental results show that the proposed SSformer yields mIoU comparable to state-of-the-art models while maintaining a smaller model size and lower compute. Source code and pretrained models are available at: https://github.com/shiwt03/SSformer
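The abstract reports results in mIoU (mean intersection over union). As a minimal sketch of how that metric is commonly computed, the snippet below builds a confusion matrix from flattened label maps and averages the per-class IoU = TP / (TP + FP + FN). The function names and the toy labels are illustrative assumptions and are not taken from the SSformer repository; skipping classes that appear in neither map is also a convention chosen here, not one stated in the source.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels: rows index the ground-truth class, columns the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix: List[List[int]]) -> float:
    """Average per-class IoU, skipping classes absent from both label maps."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]                          # pixels correctly labelled c
        fn = sum(matrix[c]) - tp                   # class-c pixels predicted as something else
        fp = sum(row[c] for row in matrix) - tp    # other pixels predicted as c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example (made-up values): flattened 2x3 label maps with 3 classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(mean_iou(confusion_matrix(pred, target, num_classes=3)))
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU is taken, rather than averaging per-image scores.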
Pages: 5