A 3-D-Swin Transformer-Based Hierarchical Contrastive Learning Method for Hyperspectral Image Classification

Cited by: 46
Authors
Huang, Xin [1 ,2 ]
Dong, Mengjie [1 ]
Li, Jiayi [1 ]
Guo, Xian [3 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Peoples R China
[3] Beijing Univ Civil Engn & Architecture, Sch Geomat & Urban Spatial Informat, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Hyperspectral imaging; Learning systems; Semantics; Current transformers; Three-dimensional displays; Task analysis; Contrastive learning; hyperspectral image (HSI) classification; self-supervised learning (SSL); Swin Transformer (SwinT); Transformer; REPRESENTATION; NETWORK;
DOI
10.1109/TGRS.2022.3202036
Chinese Library Classification number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708 ; 070902 ;
Abstract
Deep convolutional neural networks have dominated hyperspectral image (HSI) classification. However, a single convolutional kernel limits the receptive field and fails to capture the sequential properties of the data. The self-attention-based Transformer can model global sequence information; among its variants, the Swin Transformer (SwinT) combines sequence modeling capability with prior information about visual signals (e.g., locality and translation invariance). Based on SwinT, we propose a 3-D SwinT (3DSwinT) to accommodate the 3-D properties of HSI and capture its rich spatial-spectral information. Currently, supervised learning is still the most commonly used method for remote sensing image interpretation. However, pixel-by-pixel HSI classification demands a large number of high-quality labeled samples, which are time-consuming and costly to collect. As a form of unsupervised learning, self-supervised learning (SSL), especially contrastive learning, can learn semantic representations from unlabeled data and is therefore becoming a potential alternative to supervised learning. However, current contrastive learning methods are all single level or single scale and do not consider the complex and variable multiscale features of objects. Therefore, this article proposes a novel 3DSwinT-based hierarchical contrastive learning (3DSwinT-HCL) method, which can fully exploit multiscale semantic representations of images. In addition, we propose a multiscale local contrastive learning (MS-LCL) module to mine pixel-level representations in order to adapt to downstream dense prediction tasks. A series of experiments verify the great potential and superiority of 3DSwinT-HCL.
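The idea of contrasting representations at several scales can be illustrated with a minimal sketch. This is not the authors' 3DSwinT-HCL implementation: the function names `info_nce` and `hierarchical_contrastive_loss`, the InfoNCE form of the loss, the temperature value, and the uniform weighting of scales are all assumptions made for illustration. The sketch only assumes that two augmented views of the same samples yield one feature matrix per scale.

```python
import numpy as np


def info_nce(z1, z2, temperature=0.1):
    """InfoNCE loss between two views (assumed loss form, for illustration).

    Row i of z1 and row i of z2 are treated as a positive pair;
    every other row of z2 acts as a negative for row i of z1.
    """
    # L2-normalise so that dot products are cosine similarities.
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal.
    return float(-np.mean(np.diag(log_prob)))


def hierarchical_contrastive_loss(views1, views2, weights=None, temperature=0.1):
    """Weighted average of per-scale InfoNCE losses (hypothetical helper).

    views1 and views2 are lists with one (N, D_s) feature matrix per scale.
    """
    if weights is None:
        weights = [1.0] * len(views1)  # assumption: all scales weighted equally
    losses = [info_nce(a, b, temperature) for a, b in zip(views1, views2)]
    return float(np.dot(weights, losses) / np.sum(weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dims = (32, 64, 128)  # hypothetical feature widths for three scales
    view_a = [rng.normal(size=(16, d)) for d in dims]
    view_b = [f + 0.01 * rng.normal(size=f.shape) for f in view_a]
    print(hierarchical_contrastive_loss(view_a, view_b))
```

In this sketch, features from two views that agree at every scale give a loss close to zero, while unrelated features give a loss near the logarithm of the batch size.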
Pages: 15
Related papers
50 items in total
  • [1] Contrastive Learning Based on Transformer for Hyperspectral Image Classification
    Hu, Xiang
    Li, Teng
    Zhou, Tong
    Liu, Yu
    Peng, Yuanxi
    [J]. APPLIED SCIENCES-BASEL, 2021, 11 (18):
  • [2] Transformer-Based Masked Autoencoder With Contrastive Loss for Hyperspectral Image Classification
    Cao, Xianghai
    Lin, Haifeng
    Guo, Shuaixu
    Xiong, Tao
    Jiao, Licheng
    [J]. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [3] Transformer-based unsupervised contrastive learning for histopathological image classification
    Wang, Xiyue
    Yang, Sen
    Zhang, Jun
    Wang, Minghui
    Zhang, Jing
    Yang, Wei
    Huang, Junzhou
    Han, Xiao
    [J]. MEDICAL IMAGE ANALYSIS, 2022, 81
  • [4] Vision Transformer-Based Ensemble Learning for Hyperspectral Image Classification
    Liu, Jun
    Guo, Haoran
    He, Yile
    Li, Huali
    [J]. REMOTE SENSING, 2023, 15 (21)
  • [5] Vision Transformer With Contrastive Learning for Hyperspectral Image Classification
    Zhou, Heng
    Zhang, Xin
    Zhang, Chunlei
    Ma, Qiaoyu
    [J]. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2023, 20
  • [6] Hyperspectral image classification method based on hierarchical transformer network
    Zhang Y.
    Zheng X.
    Lu X.
    [J]. Cehui Xuebao/Acta Geodaetica et Cartographica Sinica, 2023, 52 (07) : 1139 - 1147
  • [7] Swin transformer with multiscale 3D atrous convolution for hyperspectral image classification
    Farooque, Ghulam
    Liu, Qichao
    Sargano, Allah Bux
    Xiao, Liang
    [J]. ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126
  • [8] An efficient swin transformer-based method for underwater image enhancement
    Wang, Rong
    Zhang, Yonghui
    Zhang, Jian
    [J]. MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 82 (12) : 18691 - 18708
  • [10] Spectral Swin Transformer Network for Hyperspectral Image Classification
    Liu, Baisen
    Liu, Yuanjia
    Zhang, Wulin
    Tian, Yiran
    Kong, Weili
    [J]. REMOTE SENSING, 2023, 15 (15)