A 3-D-Swin Transformer-Based Hierarchical Contrastive Learning Method for Hyperspectral Image Classification

Cited by: 46
Authors
Huang, Xin [1, 2]
Dong, Mengjie [1]
Li, Jiayi [1]
Guo, Xian [3]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China
[2] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Peoples R China
[3] Beijing Univ Civil Engn & Architecture, Sch Geomat & Urban Spatial Informat, Beijing 100044, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Hyperspectral imaging; Learning systems; Semantics; Current transformers; Three-dimensional displays; Task analysis; Contrastive learning; hyperspectral image (HSI) classification; self-supervised learning (SSL); Swin Transformer (SwinT); Transformer; REPRESENTATION; NETWORK;
DOI
10.1109/TGRS.2022.3202036
Chinese Library Classification (CLC) number
P3 [Geophysics]; P59 [Geochemistry];
Discipline classification code
0708; 070902;
Abstract
Deep convolutional neural networks have dominated the field of hyperspectral image (HSI) classification. However, a single convolutional kernel can limit the receptive field and fail to capture the sequential properties of the data. The self-attention-based Transformer can model global sequence information; among Transformer variants, the Swin Transformer (SwinT) combines sequence modeling capability with prior information about visual signals (e.g., locality and translation invariance). Based on SwinT, we propose a 3-D SwinT (3DSwinT) to accommodate the 3-D properties of HSI and capture its rich spatial-spectral information. Currently, supervised learning is still the most commonly used method for remote sensing image interpretation. However, pixel-by-pixel HSI classification demands a large number of high-quality labeled samples, which are time-consuming and costly to collect. As a form of unsupervised learning, self-supervised learning (SSL), especially contrastive learning, can learn semantic representations from unlabeled data and is therefore becoming a potential alternative to supervised learning. However, current contrastive learning methods are all single level or single scale and do not consider the complex and variable multiscale features of objects. Therefore, this article proposes a novel 3DSwinT-based hierarchical contrastive learning (3DSwinT-HCL) method, which can fully exploit multiscale semantic representations of images. In addition, we propose a multiscale local contrastive learning (MS-LCL) module to mine pixel-level representations in order to adapt to downstream dense prediction tasks. A series of experiments verifies the great potential and superiority of 3DSwinT-HCL.
Pages: 15
Related papers
50 in total
  • [21] Hybrid Swin Transformer-Based Classification of Gaze Target Regions
    Wu, Gongpu
    Wang, Changyuan
    Gao, Lina
    Xue, Jinna
    IEEE ACCESS, 2023, 11 : 132055 - 132067
  • [22] SpectralSWIN: a spectral-swin transformer network for hyperspectral image classification
    Ayas, Selen
    Tunc-Gormus, Esra
    INTERNATIONAL JOURNAL OF REMOTE SENSING, 2022, 43 (11) : 4025 - 4044
  • [23] Spectral-Spatial Masked Transformer With Supervised and Contrastive Learning for Hyperspectral Image Classification
    Huang, Lingbo
    Chen, Yushi
    He, Xin
    IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
  • [24] Convolutional Transformer-Based Few-Shot Learning for Cross-Domain Hyperspectral Image Classification
    Peng, Yishu
    Liu, Yaru
    Tu, Bing
    Zhang, Yuwen
    IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 1335 - 1349
  • [25] Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer
    Wang, Qingbin
    Xiong, Yuxuan
    Zhu, Hanfeng
    Mu, Xuefeng
    Zhang, Yan
    Ma, Yutao
    Computerized Medical Imaging and Graphics, 2024, 118
  • [26] Transformer-based Hierarchical Encoder for Document Classification
    Sakhrani, Harsh
    Parekh, Saloni
    Ratadiya, Pratik
    21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021 : 852 - 858
  • [27] Swin transformer-based fork architecture for automated breast tumor classification
    Uzen, Hueseyin
    Firat, Huseyin
    Atila, Orhan
    Sengur, Abdulkadir
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 256
  • [28] Classification of hyperspectral and LiDAR data by transformer-based enhancement
    Pan, Jiechen
    Shuai, Xing
    Xu, Qing
    Dai, Mofan
    Zhang, Guoping
    Wang, Guo
    REMOTE SENSING LETTERS, 2024, 15 (10) : 1074 - 1084
  • [29] Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
    Shi, Jinsong
    Gao, Pan
    Qin, Jie
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5, 2024 : 4829 - 4837
  • [30] An Implicit Transformer-based Fusion Method for Hyperspectral and Multispectral Remote Sensing Image
    Zhu, Chunyu
    Zhang, Tinghao
    Wu, Qiong
    Li, Yachao
    Zhong, Qin
    INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 131