Laformer: Vision Transformer for Panoramic Image Semantic Segmentation

Cited by: 1
Authors
Yuan, Zheng [1]
Wang, Junhua [3]
Lv, Yuxin [2]
Wang, Ding [2]
Fang, Yi [2]
Affiliations
[1] Fudan Univ, Acad Engn & Technol, Shanghai 200433, Peoples R China
[2] Fudan Univ, Sch Informat Sci & Technol, Shanghai 200433, Peoples R China
[3] Fudan Univ, Inst Optoelect, Shanghai Frontiers Sci Res Base Intelligent Optoel, Shanghai 200438, Peoples R China
Keywords
Deformable convolution; panoramic images; prototype adaptation; self-training; semantic segmentation;
DOI
10.1109/LSP.2023.3337716
Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology];
Subject classification codes
0808 ; 0809 ;
Abstract
Recent years have seen great advances in semantic segmentation. However, general methods target pinhole images and tend to underperform when applied directly to panoramic images. With the wide adoption of panoramic cameras, it is important to develop feasible approaches for training segmentation models suited to their real-time applications. To address this problem, we propose a novel self-training method and achieve comparable results on the DensePASS dataset. Specifically, we propose a deformable merge module tailored for panoramic images that efficiently and accurately incorporates features from different levels. We also design a novel prototype adaptation term that helps the model better learn the class-wise feature embeddings of distorted objects. Finally, we use a simple yet effective evaluation method to achieve real-time and improved inference performance. All combined, the method reaches an mIoU of 58.27% on the DensePASS dataset, a new state-of-the-art result.
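The abstract reports its headline result as mIoU (mean intersection-over-union) but does not say how it is computed. The sketch below shows the standard definition on flattened label lists, as an illustration only and not the authors' evaluation code. The function name `mean_iou` and the ignore label 255 are assumptions (255 is a common convention for unlabeled pixels in driving datasets), and classes absent from both prediction and ground truth are skipped here rather than counted as zero.

```python
from typing import Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: int = 255,  # assumed "unlabeled" id, not taken from the paper
) -> float:
    """Mean IoU over the classes that occur in the prediction or the ground truth."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixels do not contribute to any class
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        # |A ∪ B| = |A| + |B| - |A ∩ B|
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Per-class IoU is 1/3, 2/3 and 1/2, so the mean is 0.5.
    print(mean_iou([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3))
```

Benchmark toolkits usually accumulate the per-class intersection and union counts over the whole validation set before dividing, rather than averaging per-image scores; the same function applies if all pixels are concatenated first.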
Pages: 1792 - 1796
Number of pages: 5
Related papers
50 in total
  • [21] COMPUTATIONALLY-EFFICIENT VISION TRANSFORMER FOR MEDICAL IMAGE SEMANTIC SEGMENTATION VIA DUAL PSEUDO-LABEL SUPERVISION
    Wang, Ziyang
    Dong, Nanqing
    Voiculescu, Irina
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1961 - 1965
  • [22] Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation
    Gu, Jiaqi
    Kwon, Hyoukjun
    Wang, Dilin
    Ye, Wei
    Li, Meng
    Chen, Yu-Hsin
    Lai, Liangzhen
    Chandra, Vikas
    Pan, David Z.
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 12084 - 12093
  • [23] A Convolutional Vision Transformer for Semantic Segmentation of Side-Scan Sonar Data
    Rajani, Hayat
    Gracias, Nuno
    Garcia, Rafael
    arXiv, 2023,
  • [24] A convolutional vision transformer for semantic segmentation of side-scan sonar data
    Rajani, Hayat
    Gracias, Nuno
    Garcia, Rafael
    OCEAN ENGINEERING, 2023, 286
  • [25] Combining Swin Transformer With UNet for Remote Sensing Image Semantic Segmentation
    Fan, Lili
    Zhou, Yu
    Liu, Hongmei
    Li, Yunjie
    Cao, Dongpu
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61 : 1 - 11
  • [26] Swin Transformer Embedding UNet for Remote Sensing Image Semantic Segmentation
    He, Xin
    Zhou, Yong
    Zhao, Jiaqi
    Zhang, Di
    Yao, Rui
    Xue, Yong
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60
  • [27] Memory-Augmented Transformer for Remote Sensing Image Semantic Segmentation
    Zhao, Xin
    Guo, Jiayi
    Zhang, Yueting
    Wu, Yirong
    REMOTE SENSING, 2021, 13 (22)
  • [28] Enhancing Multiscale Representations With Transformer for Remote Sensing Image Semantic Segmentation
    Xiao, Tao
    Liu, Yikun
    Huang, Yuwen
    Li, Mingsong
    Yang, Gongping
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [29] TrSeg: Transformer for semantic segmentation
    Jin, Youngsaeng
    Han, David
    Ko, Hanseok
    PATTERN RECOGNITION LETTERS, 2021, 148 : 29 - 35
  • [30] Segmenter: Transformer for Semantic Segmentation
    Strudel, Robin
    Garcia, Ricardo
    Laptev, Ivan
    Schmid, Cordelia
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 7242 - 7252