Transformer-Based Fused Attention Combined with CNNs for Image Classification

Cited: 0
Authors
Jielin Jiang
Hongxiang Xu
Xiaolong Xu
Yan Cui
Jintao Wu
Affiliations
[1] Nanjing University of Information Science and Technology, School of Software
[2] Nanjing University of Information Science and Technology, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET)
[3] Nanjing Normal University of Special Education, College of Mathematics and Information Science
Source
Neural Processing Letters | 2023, Vol. 55
Keywords
Image classification; Swin transformer; Fusion attention; Residual convolution
DOI
Not available
Abstract
The receptive field of convolutional neural networks (CNNs) is focused on the local context, while the transformer receptive field covers the global context. Transformers have become a new backbone for computer vision owing to their powerful ability to extract global features, an ability that relies on pre-training with extensive amounts of data. However, collecting a large number of high-quality labeled images for the pre-training phase is challenging. Therefore, this paper proposes a classification network (CofaNet) that combines CNNs with transformer-based fused attention to address the limitations of transformers used without pre-training, such as low accuracy. CofaNet introduces patch-sequence-dimension attention to capture the relationships among subsequences and incorporates it into self-attention to construct a new attention feature extraction layer. A residual convolution block is then used in place of the multi-layer perceptron after the fused attention layer to compensate for the limited feature extraction capacity of the attention layer on small datasets. Experimental results on three benchmark datasets demonstrate that CofaNet achieves excellent classification accuracy compared with several transformer-based networks without pre-training.
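The sketch below (PyTorch) illustrates the general idea described in the abstract: a standard multi-head self-attention path fused with an attention computed along the patch-sequence dimension, followed by a residual convolution block in place of the usual MLP. The module structure, dimensions, and the exact fusion rule are illustrative assumptions, not the authors' published implementation.

```python
# Minimal sketch of a fused-attention block, assuming a fusion of token-wise
# self-attention with a patch-sequence-dimension attention and a residual
# 1-D convolution block replacing the MLP. Hypothetical, not CofaNet's code.
import torch
import torch.nn as nn


class FusedAttentionBlock(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        # Standard multi-head self-attention over patch tokens (global context).
        self.self_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Hypothetical patch-sequence-dimension attention: one score per patch,
        # normalized over the sequence, used to reweight the tokens.
        self.seq_attn = nn.Sequential(nn.Linear(dim, 1), nn.Softmax(dim=1))
        self.norm2 = nn.LayerNorm(dim)
        # Residual convolution block used instead of the multi-layer perceptron.
        self.conv = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, dim)
        h = self.norm1(x)
        sa, _ = self.self_attn(h, h, h)        # global self-attention path
        seq_w = self.seq_attn(h)               # (batch, num_patches, 1) weights
        x = x + sa + seq_w * h                 # fuse the two attention paths
        h = self.norm2(x).transpose(1, 2)      # (batch, dim, num_patches)
        x = x + self.conv(h).transpose(1, 2)   # residual convolution block
        return x


# Usage example: 196 patch tokens of dimension 96, batch of 2.
tokens = torch.randn(2, 196, 96)
block = FusedAttentionBlock(dim=96)
print(block(tokens).shape)  # torch.Size([2, 196, 96])
```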
Pages: 11905-11919
Number of pages: 14