A 3-D-Swin Transformer-Based Hierarchical Contrastive Learning Method for Hyperspectral Image Classification

Cited by: 46

Authors:

Huang, Xin [1,2]

Dong, Mengjie [1]

Li, Jiayi [1]

Guo, Xian [3]

Affiliations:

[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Peoples R China

[2] Wuhan Univ, State Key Lab Informat Engn Surveying Mapping & R, Wuhan 430079, Peoples R China

[3] Beijing Univ Civil Engn & Architecture, Sch Geomat & Urban Spatial Informat, Beijing 100044, Peoples R China

Source:

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING | 2022 / Vol. 60

Funding:

National Natural Science Foundation of China;

Keywords:

Feature extraction; Hyperspectral imaging; Learning systems; Semantics; Current transformers; Three-dimensional displays; Task analysis; Contrastive learning; hyperspectral image (HSI) classification; self-supervised learning (SSL); Swin Transformer (SwinT); Transformer; REPRESENTATION; NETWORK;

DOI:

10.1109/TGRS.2022.3202036

Chinese Library Classification (CLC) number:

P3 [Geophysics]; P59 [Geochemistry];

Discipline classification code:

0708 ; 070902 ;

Abstract:

Deep convolutional neural networks have dominated the field of hyperspectral image (HSI) classification. However, a single convolutional kernel limits the receptive field and fails to capture the sequential properties of the data. The self-attention-based Transformer can model global sequence information, and among Transformer variants the Swin Transformer (SwinT) combines sequence modeling capability with prior information about visual signals (e.g., locality and translation invariance). Based on SwinT, we propose a 3-D SwinT (3DSwinT) that accommodates the 3-D properties of HSI and captures its rich spatial-spectral information. Currently, supervised learning is still the most commonly used method for remote sensing image interpretation. However, pixel-by-pixel HSI classification demands a large number of high-quality labeled samples, which are time-consuming and costly to collect. As a form of unsupervised learning, self-supervised learning (SSL), especially contrastive learning, can learn semantic representations from unlabeled data and, hence, is becoming a potential alternative to supervised learning. On the other hand, current contrastive learning methods are all single level or single scale and do not consider the complex and variable multiscale features of objects. Therefore, this article proposes a novel 3DSwinT-based hierarchical contrastive learning (3DSwinT-HCL) method, which can fully exploit the multiscale semantic representations of images. In addition, we propose a multiscale local contrastive learning (MS-LCL) module that mines pixel-level representations in order to adapt to downstream dense prediction tasks. A series of experiments verifies the great potential and superiority of 3DSwinT-HCL.
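
The abstract does not give the loss used by 3DSwinT-HCL, so the following is only a minimal sketch of the general idea of contrasting representations at several scales. It assumes a standard InfoNCE objective per scale and a simple weighted sum across scales; the function names (`info_nce`, `hierarchical_loss`), the temperature, and the equal weights are illustrative assumptions, not the authors' formulation. Plain Python is used to avoid dependencies.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length (zero vectors are not handled)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def info_nce(view_a, view_b, temperature=0.1):
    """Generic InfoNCE loss over a batch of embeddings.

    view_a[i] and view_b[i] are two augmented views of the same sample
    (a positive pair); every view_b[j] with j != i acts as a negative.
    """
    a = [l2_normalize(v) for v in view_a]
    b = [l2_normalize(v) for v in view_b]
    total = 0.0
    for i, anchor in enumerate(a):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [sum(x * y for x, y in zip(anchor, cand)) / temperature
                  for cand in b]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def hierarchical_loss(views_a_by_scale, views_b_by_scale,
                      weights=None, temperature=0.1):
    """Weighted sum of per-scale InfoNCE losses (assumed combination rule)."""
    n_scales = len(views_a_by_scale)
    if weights is None:
        weights = [1.0 / n_scales] * n_scales  # assumption: equal weighting
    return sum(w * info_nce(a, b, temperature)
               for w, a, b in zip(weights, views_a_by_scale, views_b_by_scale))


# Toy embeddings standing in for encoder outputs at two hypothetical scales.
coarse = [[1.0, 0.0], [0.0, 1.0]]
fine = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
loss = hierarchical_loss([coarse, fine], [coarse, fine])
print(loss)
```

In this sketch each scale contributes its own contrastive term, so features from shallow and deep stages of a hierarchical encoder would both receive a training signal. How the published method selects positives and negatives at each level, and how MS-LCL forms pixel-level pairs, would have to be taken from the full paper.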

Pages: 15

50 items in total

[21] Hybrid Swin Transformer-Based Classification of Gaze Target Regions
Wu, Gongpu
Wang, Changyuan
Gao, Lina
Xue, Jinna
IEEE ACCESS, 2023, 11 : 132055 - 132067
[22] SpectralSWIN: a spectral-swin transformer network for hyperspectral image classification
Ayas, Selen
Tunc-Gormus, Esra
INTERNATIONAL JOURNAL OF REMOTE SENSING, 2022, 43 (11) : 4025 - 4044
[23] Spectral-Spatial Masked Transformer With Supervised and Contrastive Learning for Hyperspectral Image Classification
Huang, Lingbo
Chen, Yushi
He, Xin
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61
[24] Convolutional Transformer-Based Few-Shot Learning for Cross-Domain Hyperspectral Image Classification
Peng, Yishu
Liu, Yaru
Tu, Bing
Zhang, Yuwen
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING, 2023, 16 : 1335 - 1349
[25] Cervical OCT image classification using contrastive masked autoencoders with Swin Transformer
Wang, Qingbin
Xiong, Yuxuan
Zhu, Hanfeng
Mu, Xuefeng
Zhang, Yan
Ma, Yutao
Computerized Medical Imaging and Graphics, 2024, 118
[26] Transformer-based Hierarchical Encoder for Document Classification
Sakhrani, Harsh
Parekh, Saloni
Ratadiya, Pratik
21ST IEEE INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS ICDMW 2021, 2021, : 852 - 858
[27] Swin transformer-based fork architecture for automated breast tumor classification
Uzen, Hueseyin
Firat, Huseyin
Atila, Orhan
Sengur, Abdulkadir
EXPERT SYSTEMS WITH APPLICATIONS, 2024, 256
[28] Classification of hyperspectral and LiDAR data by transformer-based enhancement
Pan, Jiechen
Shuai, Xing
Xu, Qing
Dai, Mofan
Zhang, Guoping
Wang, Guo
REMOTE SENSING LETTERS, 2024, 15 (10) : 1074 - 1084
[29] Transformer-Based No-Reference Image Quality Assessment via Supervised Contrastive Learning
Shi, Jinsong
Gao, Pan
Qin, Jie
THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 5, 2024, : 4829 - 4837
[30] An Implicit Transformer-based Fusion Method for Hyperspectral and Multispectral Remote Sensing Image
Zhu, Chunyu
Zhang, Tinghao
Wu, Qiong
Li, Yachao
Zhong, Qin
INTERNATIONAL JOURNAL OF APPLIED EARTH OBSERVATION AND GEOINFORMATION, 2024, 131
