Light Self-Gaussian-Attention Vision Transformer for Hyperspectral Image Classification

Cited: 13
Authors
Ma, Chao [1 ,2 ]
Wan, Minjie [1 ,2 ]
Wu, Jian [3 ]
Kong, Xiaofang [4 ]
Shao, Ajun [1 ,2 ]
Wang, Fan [1 ,2 ]
Chen, Qian [1 ,2 ]
Gu, Guohua [1 ,2 ]
Affiliations
[1] Nanjing Univ Sci & Technol, Sch Elect & Opt Engn, Nanjing 210094, Peoples R China
[2] Nanjing Univ Sci & Technol, Jiangsu Key Lab Spectral Imaging & Intelligent Sen, Nanjing 210094, Peoples R China
[3] Southeast Univ, Sch Comp Sci & Engn, Nanjing 211189, Peoples R China
[4] Nanjing Univ Sci & Technol, Natl Key Lab Transient Phys, Nanjing 210094, Peoples R China
Keywords
Feature extraction; Transformers; Principal component analysis; Computational modeling; Task analysis; Data mining; Correlation; Gaussian position module; hybrid spatial-spectral tokenizer; hyperspectral image (HSI) classification; light self-Gaussian attention (LSGA); location-aware long-distance modeling; NETWORK;
DOI
10.1109/TIM.2023.3279922
CLC Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808 ; 0809 ;
Abstract
In recent years, convolutional neural networks (CNNs) have been widely used in hyperspectral image (HSI) classification because of their exceptional performance in local feature extraction. However, due to the local connectivity and weight-sharing properties of the convolution kernel, CNNs are limited in long-distance modeling, and deeper networks tend to increase computational costs. To address these issues, this article proposes a vision Transformer (ViT) based on the light self-Gaussian-attention (LSGA) mechanism, which extracts global deep semantic features. First, the hybrid spatial-spectral tokenizer module extracts shallow spatial-spectral features and expands image patches to generate tokens. Next, the light self-attention uses Q (query), X (original input), and X in place of Q, K (key), and V (value) to reduce computation and parameters. Furthermore, to prevent the absence of location information from causing aliasing between central and neighborhood features, we devise a Gaussian absolute position bias that simulates the HSI data distribution and draws the attention weights closer to the central query block. Several experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art methods on four datasets. Specifically, we observed a 0.62% accuracy improvement over A2S2K and a 0.11% improvement over SSFTT. In conclusion, the proposed LSGA-VIT method demonstrates promising results in HSI classification and shows potential in addressing the issues of location-aware long-distance modeling and computational cost. Our code is available at https://github.com/machao132/LSGA-VIT.
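The two ideas summarized in the abstract, reusing the input X in place of the K and V projections and adding a Gaussian absolute position bias centered on the query patch, can be sketched as follows. This is a minimal numpy illustration of the general scheme, not the authors' implementation; the function names, the single projection matrix `Wq`, and the `sigma` parameter are assumptions for illustration (see the linked repository for the actual code).

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gaussian_position_bias(h, w, sigma=1.0):
    # Gaussian of each patch's squared distance to the central patch,
    # so attention favors tokens near the center of the HSI patch
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d2 = (ys - cy) ** 2 + (xs - cx) ** 2
    return np.exp(-d2.ravel() / (2.0 * sigma ** 2))  # shape (h*w,)

def light_self_gaussian_attention(X, Wq, hw=None, sigma=1.0):
    # X: (n, d) token matrix. Only Q is projected; K and V both reuse X,
    # which removes two projection matrices' worth of parameters and FLOPs.
    n, d = X.shape
    Q = X @ Wq
    scores = (Q @ X.T) / np.sqrt(d)          # K := X
    if hw is not None:
        bias = gaussian_position_bias(*hw, sigma=sigma)
        scores = scores + bias[None, :]      # absolute position bias per key
    A = softmax(scores, axis=-1)
    return A @ X                             # V := X
```

With X of shape `(h*w, d)` for an `h x w` patch grid, the output keeps the token shape `(h*w, d)`; the bias term simply shifts each row of the score matrix toward keys near the grid center before the softmax.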
Pages: 12