Transformer-Based Decoder Designs for Semantic Segmentation on Remotely Sensed Images

Cited by: 42
Authors
Panboonyuen, Teerapong [1]
Jitkajornwanich, Kulsawasd [2]
Lawawirojwong, Siam [3]
Srestasathiern, Panu [3]
Vateekul, Peerapon [1]
Affiliations
[1] Chulalongkorn Univ, Fac Engn, Dept Comp Engn, Phayathai Rd, Bangkok 10330, Thailand
[2] King Mongkuts Inst Technol Ladkrabang, Dept Comp Sci, Data Sci & Computat Intelligence DSCI Lab, Chalongkrung Rd, Bangkok 10520, Thailand
[3] Geoinformat & Space Technol Dev Agcy Publ Org, 120 Govt Complex,Chaeng Wattana Rd, Bangkok 10210, Thailand
Keywords
vision transformer; fully transformer networks; convolutional neural network; feature pyramid network; high-resolution representations; ISPRS Vaihingen; Landsat-8
DOI
10.3390/rs13245100
Chinese Library Classification (CLC) number
X [Environmental Science, Safety Science]
Discipline classification code
08; 0830
Abstract
Transformers have achieved remarkable results in several natural language processing (NLP) tasks as well as image processing tasks. Herein, we present a deep-learning (DL) model that improves the semantic segmentation network in two ways. First, using a pretrained Swin Transformer (SwinTF) from the Vision Transformer (ViT) family as the backbone, the model is adapted to downstream tasks by adding task layers on top of the pretrained encoder. Second, three decoder designs, U-Net, pyramid scene parsing (PSP) network, and feature pyramid network (FPN), are applied to our DL network to perform pixel-level segmentation. The results are compared with other state-of-the-art (SOTA) image labeling methods, such as global convolutional network (GCN) and ViT. Extensive experiments show that SwinTF with these decoder designs reached a new state of the art on the Thailand Isan Landsat-8 corpus (89.8% F1 score) and the Thailand North Landsat-8 corpus (63.12% F1 score), and achieved competitive results on ISPRS Vaihingen. Moreover, both of our best proposed methods (SwinTF-PSP and SwinTF-FPN) outperformed SwinTF with supervised ViT pre-training on ImageNet-1K on the Thailand Landsat-8 and ISPRS Vaihingen corpora.
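The abstract reports results as F1 scores for pixel-level segmentation. As a rough illustration only, the sketch below computes a per-class F1 score from flattened label maps, assuming the common definition F1 = 2TP / (2TP + FP + FN). The function name `f1_per_class` and the toy labels are hypothetical, and this record does not state how the paper averages scores across classes.

```python
def f1_per_class(pred, truth, num_classes):
    """Per-class F1 from two flat sequences of integer class labels.

    Illustrative sketch only; not the paper's evaluation code.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")

    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class

    for p, t in zip(pred, truth):
        if p == t:
            tp[p] += 1
        else:
            fp[p] += 1  # pixel wrongly assigned to class p
            fn[t] += 1  # pixel of class t that was missed

    scores = []
    for c in range(num_classes):
        denom = 2 * tp[c] + fp[c] + fn[c]
        # A class absent from both prediction and truth is scored 0.0 here (an assumption).
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return scores


# Toy example with three classes on six pixels (hypothetical data).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
scores = f1_per_class(pred, truth, num_classes=3)
```

A single headline figure such as those quoted above would normally come from averaging these per-class values, but the averaging scheme is an assumption that should be checked against the paper itself.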
Pages: 21