Look Closer to See Better: Recurrent Attention Convolutional Neural Network for Fine-grained Image Recognition

Cited by: 886
Authors
Fu, Jianlong [1]
Zheng, Heliang [2]
Mei, Tao [1]
Affiliations
[1] Microsoft Research, Beijing, People's Republic of China
[2] University of Science and Technology of China, Hefei, Anhui, People's Republic of China
DOI
10.1109/CVPR.2017.476
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing fine-grained categories (e.g., bird species) is difficult due to the challenges of discriminative region localization and fine-grained feature learning. Existing approaches predominantly solve these challenges independently, while neglecting the fact that region detection and fine-grained feature learning are mutually correlated and thus can reinforce each other. In this paper, we propose a novel recurrent attention convolutional neural network (RA-CNN) which recursively learns discriminative region attention and region-based feature representation at multiple scales in a mutually reinforced way. The learning at each scale consists of a classification sub-network and an attention proposal sub-network (APN). The APN starts from full images, and iteratively generates region attention from coarse to fine by taking previous predictions as a reference, while a finer scale network takes as input an amplified attended region from previous scales in a recurrent way. The proposed RA-CNN is optimized by an intra-scale classification loss and an inter-scale ranking loss, to mutually learn accurate region attention and fine-grained representation. RA-CNN does not need bounding box/part annotations and can be trained end-to-end. We conduct comprehensive experiments and show that RA-CNN achieves the best performance in three fine-grained tasks, with relative accuracy gains of 3.3%, 3.7%, 3.8%, on CUB Birds, Stanford Dogs and Stanford Cars, respectively.
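The two losses named in the abstract can be illustrated with a small sketch. The abstract does not give formulas, so the hinge form of the ranking loss, the `margin` value, and the plain sum used to combine the terms are assumptions made here for illustration, as are all function names. This is not the authors' implementation.

```python
import math

def softmax(logits):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_loss(logits, label):
    """Intra-scale loss: cross-entropy of one scale's prediction."""
    return -math.log(softmax(logits)[label])

def ranking_loss(coarse_logits, fine_logits, label, margin=0.05):
    """Inter-scale loss (assumed hinge form): zero once the finer scale
    assigns the true class at least `margin` more probability than the
    coarser scale, positive otherwise."""
    p_coarse = softmax(coarse_logits)[label]
    p_fine = softmax(fine_logits)[label]
    return max(0.0, p_coarse - p_fine + margin)

def total_loss(logits_per_scale, label, margin=0.05):
    """Sum the per-scale classification losses and the ranking losses
    between consecutive scales (coarse to fine). The plain sum is an
    assumption of this sketch."""
    cls = sum(classification_loss(l, label) for l in logits_per_scale)
    rank = sum(
        ranking_loss(a, b, label, margin)
        for a, b in zip(logits_per_scale, logits_per_scale[1:])
    )
    return cls + rank
```

In this sketch the ranking term vanishes when the finer scale is already more confident about the true class, so it only penalizes attended regions that fail to improve on the coarser prediction.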
Pages: 4476-4484
Page count: 9
Related papers
50 in total
  • [31] Subtler mixed attention network on fine-grained image classification
    Chao Liu
    Lei Huang
    Zhiqiang Wei
    Wenfeng Zhang
    [J]. Applied Intelligence, 2021, 51 : 7903 - 7916
  • [32] Fine-grained Vehicle Recognition Using Lightweight Convolutional Neural Network with Combined Learning Strategy
    Zhang, Qiang
    Zhuo, Li
    Zhang, Shiyu
    Li, Jiafeng
    Zhang, Hui
    Li, Xiaoguang
    [J]. 2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018
  • [33] Fine-grained Cars Recognition using Deep Convolutional Neural Networks
    Oliveira, Franklin
    Macena, Arianne
    Kamel, Otavio
    Souza, Wesley
    Freitas, Nicksson
    Vinuto, Tiago
    [J]. 2022 35TH SIBGRAPI CONFERENCE ON GRAPHICS, PATTERNS AND IMAGES (SIBGRAPI 2022), 2022, : 240 - 245
  • [34] Refining deep convolutional features for improving fine-grained image recognition
    Weixia Zhang
    Jia Yan
    Wenxuan Shi
    Tianpeng Feng
    Dexiang Deng
    [J]. EURASIP Journal on Image and Video Processing, 2017
  • [35] Bilinear Convolutional Neural Networks for Fine-Grained Visual Recognition
    Lin, Tsung-Yu
    RoyChowdhury, Aruni
    Maji, Subhransu
    [J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2018, 40 (06) : 1309 - 1322
  • [36] Refining deep convolutional features for improving fine-grained image recognition
    Zhang, Weixia
    Yan, Jia
    Shi, Wenxuan
    Feng, Tianpeng
    Deng, Dexiang
    [J]. EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING, 2017
  • [37] INCREASINGLY SPECIALIZED ENSEMBLE OF CONVOLUTIONAL NEURAL NETWORKS FOR FINE-GRAINED RECOGNITION
    Simonelli, Andrea
    Messelodi, Stefano
    De Natale, Francesco
    Bulo, Samuel Rota
    [J]. 2018 25TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2018, : 594 - 598
  • [38] Neural Prototype Trees for Interpretable Fine-grained Image Recognition
    Nauta, Meike
    van Bree, Ron
    Seifert, Christin
    [J]. 2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021, 2021, : 14928 - 14938
  • [39] MODELLING LOCAL DEEP CONVOLUTIONAL NEURAL NETWORK FEATURES TO IMPROVE FINE-GRAINED IMAGE CLASSIFICATION
    Ge, ZongYuan
    McCool, Chris
    Sanderson, Conrad
    Corke, Peter
    [J]. 2015 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2015, : 4112 - 4116
  • [40] Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-grained Image Recognition
    Zheng, Heliang
    Fu, Jianlong
    Zha, Zheng-Jun
    Luo, Jiebo
    [J]. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019, : 5007 - 5016