Bilinear Residual Attention Networks for Fine-Grained Image Classification

Citations: 4

Authors:

Wang Yang [1]

Liu Libo [1]

Affiliations:

[1] Ningxia Univ, Sch Informat Engn, Yinchuan 750021, Ningxia, Peoples R China

Source:

LASER & OPTOELECTRONICS PROGRESS | 2020 / Vol. 57 / Issue 12

Keywords:

image processing; fine-grained image classification; attention mechanism; residual network; channel attention; spatial attention

DOI:

10.3788/LOP57.121011

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Fine-grained images are highly similar in appearance, and the differences between classes are often confined to local regions, so extracting discriminative local features plays a key role in fine-grained classification. Attention mechanisms are a common strategy for this problem. We therefore propose an improved bilinear residual attention network based on the bilinear convolutional neural network model. The feature function of the original model is replaced by a deep residual network with stronger feature extraction capability, and a channel attention module and a spatial attention module are then added between the residual units to obtain richer attention features along different dimensions. Ablation and comparison experiments were performed on three fine-grained image datasets (CUB-200-2011, Stanford Dogs, and Stanford Cars), on which the classification accuracy of the improved model reached 87.2%, 89.2%, and 92.5%, respectively. The experimental results show that our method achieves better classification results than the original model and other mainstream fine-grained classification algorithms.
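The abstract names two building blocks without giving their details: attention applied to intermediate features, and bilinear pooling of the resulting feature maps. The following is a minimal pure-Python sketch of those two ideas and is not the authors' implementation. The function names `channel_gate` and `bilinear_pool` are hypothetical, the gate is a parameter-free stand-in for a learned channel attention module, and the signed square root plus L2 normalization follow common bilinear CNN practice rather than anything stated in this record.

```python
import math


def channel_gate(feats):
    """Rescale each channel by sigmoid(mean activation of that channel).

    feats: list of spatial locations, each a list of C channel values.
    Assumption: a parameter-free stand-in for a learned channel
    attention module; the paper's module is not specified here.
    """
    n = len(feats)
    channels = len(feats[0])
    means = [sum(loc[c] for loc in feats) / n for c in range(channels)]
    gates = [1.0 / (1.0 + math.exp(-m)) for m in means]
    return [[loc[c] * gates[c] for c in range(channels)] for loc in feats]


def bilinear_pool(feats_a, feats_b):
    """Average the outer product of two feature streams over locations.

    feats_a: L locations x Ca channels, feats_b: L locations x Cb channels.
    Returns a flat descriptor of length Ca * Cb, passed through a signed
    square root and L2 normalization (assumed post-processing).
    """
    if not feats_a or len(feats_a) != len(feats_b):
        raise ValueError("both streams need the same, non-zero number of locations")
    n = len(feats_a)
    cb = len(feats_b[0])
    pooled = [0.0] * (len(feats_a[0]) * cb)
    for fa, fb in zip(feats_a, feats_b):
        for i, a in enumerate(fa):
            for j, b in enumerate(fb):
                pooled[i * cb + j] += a * b / n
    # Signed square root keeps the sign while compressing large values.
    pooled = [math.copysign(math.sqrt(abs(v)), v) for v in pooled]
    norm = math.sqrt(sum(v * v for v in pooled))
    return [v / norm for v in pooled] if norm > 0 else pooled


# Toy input: two spatial locations with two channels each.
features = [[1.0, 0.0], [0.0, 2.0]]
gated = channel_gate(features)
descriptor = bilinear_pool(gated, gated)
```

In a real model the two streams would be convolutional feature maps from the residual backbone, and the descriptor would feed a linear classifier. The sketch only shows the shape of the computation: a descriptor of length Ca * Cb built from pairwise channel interactions.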

Pages: 10

References: 27 in total
