Light Self-Gaussian-Attention Vision Transformer for Hyperspectral Image Classification

Cited by: 13
Authors:
Ma, Chao [1 ,2 ]
Wan, Minjie [1 ,2 ]
Wu, Jian [3 ]
Kong, Xiaofang [4 ]
Shao, Ajun [1 ,2 ]
Wang, Fan [1 ,2 ]
Chen, Qian [1 ,2 ]
Gu, Guohua [1 ,2 ]
Affiliations:
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China
[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Spectral Imaging & Intelligent Sen, Nanjing 210094, Peoples R China
[3] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China
[4] Nanjing Univ Sci & Technol, Natl Key Lab Transient Phys, Nanjing 210094, Peoples R China
Keywords:
Feature extraction; Transformers; Principal component analysis; Computational modeling; Task analysis; Data mining; Correlation; Gaussian position module; hybrid spatial-spectral tokenizer; hyperspectral image (HSI) classification; light self-Gaussian attention (LSGA); location-aware long-distance modeling; NETWORK;
DOI
10.1109/TIM.2023.3279922
Chinese Library Classification (CLC):
TM [Electrical Technology]; TN [Electronic and Communication Technology];
Discipline Codes:
0808; 0809;
Abstract:
In recent years, convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their exceptional performance in local feature extraction. However, due to the local connectivity and weight-sharing properties of the convolution kernel, CNNs are limited in long-distance modeling, and deeper networks tend to increase the computational cost. To address these issues, this article proposes a vision Transformer (VIT) based on the light self-Gaussian-attention (LSGA) mechanism, which extracts global deep semantic features. First, the hybrid spatial-spectral tokenizer module extracts shallow spatial-spectral features and expands image patches to generate tokens. Next, the light self-attention uses Q (query), X (original input), and X in place of Q, K (key), and V (value) to reduce computation and parameters. Furthermore, to prevent the lack of location information from causing aliasing of central and neighborhood features, we devise a Gaussian absolute position bias that simulates the HSI data distribution and concentrates the attention weights closer to the central query block. Several experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art methods on four datasets. Specifically, we observed a 0.62% accuracy improvement over A2S2K and a 0.11% improvement over SSFTT. In conclusion, the proposed LSGA-VIT method demonstrates promising results in HSI classification and shows potential in addressing the issues of location-aware long-distance modeling and computational cost. Our code is available at https://github.com/machao132/LSGA-VIT.
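To make the attention variant described in the abstract more concrete, the following PyTorch sketch shows a light self-attention layer in which the input tokens X stand in for both keys and values, combined with a fixed Gaussian absolute position bias that favors keys near the patch center. The class name `LightSelfGaussianAttention`, the single-head formulation, the `sigma` parameter, and the exact form of the bias are illustrative assumptions rather than the authors' implementation; the released code at https://github.com/machao132/LSGA-VIT is the authoritative reference.

```python
# Minimal sketch of a light self-Gaussian-attention layer (assumptions noted
# in comments); this is not the authors' exact implementation.
import torch
import torch.nn as nn


class LightSelfGaussianAttention(nn.Module):
    """Single-head light self-attention with a Gaussian absolute position bias.

    Keys and values are replaced by the raw input tokens X, so only the query
    projection adds attention parameters (the idea sketched in the abstract).
    """

    def __init__(self, dim: int, num_tokens: int, sigma: float = 2.0):
        super().__init__()
        self.scale = dim ** -0.5
        self.to_q = nn.Linear(dim, dim, bias=False)  # query projection only
        self.proj = nn.Linear(dim, dim)              # output projection

        # Assumed bias form: tokens lie on a sqrt(N) x sqrt(N) patch grid and
        # keys are weighted by a Gaussian centered on the middle of the patch,
        # mimicking "attention weights closer to the central query block".
        side = int(round(num_tokens ** 0.5))
        assert side * side == num_tokens, "expects a square patch of tokens"
        ys, xs = torch.meshgrid(
            torch.arange(side, dtype=torch.float32),
            torch.arange(side, dtype=torch.float32),
            indexing="ij",
        )
        center = (side - 1) / 2.0
        dist2 = (ys - center) ** 2 + (xs - center) ** 2
        bias = torch.exp(-dist2 / (2.0 * sigma ** 2)).flatten()  # shape (N,)
        self.register_buffer("pos_bias", bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N, dim) tokens from a spatial-spectral tokenizer
        q = self.to_q(x)
        attn = (q @ x.transpose(-2, -1)) * self.scale  # (batch, N, N); keys = X
        attn = attn + self.pos_bias                    # additive bias toward central keys
        attn = attn.softmax(dim=-1)
        return self.proj(attn @ x)                     # values = X


if __name__ == "__main__":
    tokens = torch.randn(2, 49, 64)  # e.g., a 7 x 7 patch of 64-dim tokens
    layer = LightSelfGaussianAttention(dim=64, num_tokens=49)
    print(layer(tokens).shape)       # torch.Size([2, 49, 64])
```

Under these assumptions, dropping the key and value projections removes two of the three dim-by-dim projection matrices of standard self-attention, which is where the claimed savings in parameters and computation would come from.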
Pages: 12
Related Papers (50 records in total):
  • [21] Viel, Felipe; Maciel, Renato Cotrim; Seman, Laio Oriel; Zeferino, Cesar Albenes; Bezerra, Eduardo Augusto; Leithardt, Valderi Reis Quietinho. Hyperspectral Image Classification: An Analysis Employing CNN, LSTM, Transformer, and Attention Mechanism. IEEE ACCESS, 2023, 11: 24835-24850.
  • [22] Qiao, Xin; Roy, Swalpa Kumar; Huang, Weimin. Multiscale Neighborhood Attention Transformer With Optimized Spatial Pattern for Hyperspectral Image Classification. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61.
  • [23] Zhao, Xiaofeng; Niu, Jiahui; Liu, Chuntong; Ding, Yao; Hong, Danfeng. Hyperspectral Image Classification Based on Graph Transformer Network and Graph Attention Mechanism. IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, 2022, 19.
  • [24] Peng, Yishu; Zhang, Yuwen; Tu, Bing; Li, Qianming; Li, Wujing. Spatial-Spectral Transformer With Cross-Attention for Hyperspectral Image Classification. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60.
  • [25] Zhang, Bo; Chen, Yaxiong; Rong, Yi; Xiong, Shengwu; Lu, Xiaoqiang. MATNet: A Combining Multi-Attention and Transformer Network for Hyperspectral Image Classification. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2023, 61.
  • [26] Dang, Lanxue; Weng, Libo; Dong, Weichuan; Li, Shenshen; Hou, Yane. Spectral-Spatial Attention Transformer with Dense Connection for Hyperspectral Image Classification. COMPUTATIONAL INTELLIGENCE AND NEUROSCIENCE, 2022, 2022.
  • [27] Yang, Xiaofei; Cao, Weijia; Lu, Yao; Zhou, Yicong. Hyperspectral Image Transformer Classification Networks. IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2022, 60.
  • [28] Feng, Jiaqi; Luo, Xiaoyan; Li, Sen; Wang, Qixiong; Yin, Jihao. Spectral Transformer With Dynamic Spatial Sampling and Gaussian Positional Embedding for Hyperspectral Image Classification. 2022 IEEE INTERNATIONAL GEOSCIENCE AND REMOTE SENSING SYMPOSIUM (IGARSS 2022), 2022: 3556-3559.
  • [29] Yoo, D.; Yoo, J. Refined Feature-Space Window Attention Vision Transformer for Image Classification. Transactions of the Korean Institute of Electrical Engineers, 2024, 73(06): 1004-1011.
  • [30] Li, Bin; Ouyang, Er; Hu, Wenjing; Zhang, Guoyun; Zhao, Lin; Wu, Jianhui. Multi-granularity vision transformer via semantic token for hyperspectral image classification. INTERNATIONAL JOURNAL OF REMOTE SENSING, 2022, 43(17): 6538-6560.