Bilinear Residual Attention Networks for Fine-Grained Image Classification

Cited by: 4
Authors
Wang Yang [1]
Liu Libo [1]
Affiliation
[1] Ningxia Univ, Sch Informat Engn, Yinchuan 750021, Ningxia, Peoples R China
Keywords
image processing; fine-grained image classification; attention mechanism; residual network; channel attention; spatial attention
DOI
10.3788/LOP57.121011
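Chinese Library Classification
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology]
Discipline classification code
0808; 0809
Abstract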
Fine-grained images are highly similar in appearance, and the differences between classes are often confined to local regions, so extracting discriminative local features plays a key role in fine-grained classification. Attention mechanisms are a common strategy for this problem. We therefore propose an improved bilinear residual attention network based on the bilinear convolutional neural network model: the feature function of the original model is replaced by a deep residual network with stronger feature extraction capability, and a channel attention module and a spatial attention module are added between the residual units to obtain richer attention features along different dimensions. Ablation and comparison experiments were performed on three fine-grained image datasets, CUB-200-2011, Stanford Dogs, and Stanford Cars, on which the improved model reached classification accuracies of 87.2%, 89.2%, and 92.5%, respectively. The experimental results show that our method achieves better classification results than the original model and other mainstream fine-grained classification algorithms.
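The pipeline described in the abstract (attention applied to convolutional features, followed by bilinear pooling) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the form of the attention modules, so the squeeze-and-excitation-style channel gate, the parameter-free spatial gate, the function names, and the random weights below are all assumptions made for illustration. The modules are applied here to a single feature map rather than between the residual units of a ResNet. The pooling step follows the usual bilinear CNN recipe: outer product pooled over locations, signed square root, then L2 normalization.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight channels of feat (C, H, W) with an assumed squeeze-and-excitation-style gate."""
    squeezed = feat.mean(axis=(1, 2))        # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)  # bottleneck with ReLU -> (C // r,)
    gate = sigmoid(w2 @ hidden)              # per-channel weights in (0, 1) -> (C,)
    return feat * gate[:, None, None]


def spatial_attention(feat):
    """Reweight locations of feat (C, H, W) with an assumed parameter-free gate."""
    pooled = feat.mean(axis=0) + feat.max(axis=0)  # channel-wise mean + max -> (H, W)
    gate = sigmoid(pooled)                         # per-location weights in (0, 1)
    return feat * gate[None, :, :]


def bilinear_pool(fa, fb):
    """Bilinear pooling of two feature maps (Ca, H, W) and (Cb, H, W) -> vector of length Ca * Cb."""
    ca, h, w = fa.shape
    cb = fb.shape[0]
    a = fa.reshape(ca, h * w)
    b = fb.reshape(cb, h * w)
    outer = (a @ b.T) / (h * w)                # outer product averaged over locations
    vec = outer.reshape(-1)
    vec = np.sign(vec) * np.sqrt(np.abs(vec))  # signed square root
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec     # L2 normalization


# Hypothetical toy input: 8 channels on a 4 x 4 grid, reduction ratio r = 4.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w1 = rng.standard_normal((2, 8))
w2 = rng.standard_normal((8, 2))

attended = spatial_attention(channel_attention(feat, w1, w2))
descriptor = bilinear_pool(attended, attended)
print(descriptor.shape)  # (64,)
```

In a full model the descriptor would feed a linear classifier, and the two inputs to `bilinear_pool` would come from the residual-network feature extractors the abstract mentions.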
Pages: 10