Fine-Grained Recognition With Learnable Semantic Data Augmentation

Cited by: 8
Authors
Pu, Yifan [1]
Han, Yizeng [1]
Wang, Yulin [1]
Feng, Junlan [2]
Deng, Chao [2]
Huang, Gao [1]
Affiliations
[1] Tsinghua University, Department of Automation, BNRist, Beijing 100084, People's Republic of China
[2] China Mobile Research Institute, Beijing 100053, People's Republic of China
Keywords
Fine-grained recognition; data augmentation; meta-learning; deep learning; classification; image
DOI
10.1109/TIP.2024.3364500
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Fine-grained image recognition is a longstanding computer vision challenge that focuses on differentiating objects belonging to multiple subordinate categories within the same meta-category. Since images belonging to the same meta-category usually share similar visual appearances, mining discriminative visual cues is the key to distinguishing fine-grained categories. Although commonly used image-level data augmentation techniques have achieved great success in generic image classification, they are rarely applied in fine-grained scenarios, because they edit randomly chosen regions and are therefore prone to destroying the discriminative visual cues that reside in subtle regions. In this paper, we propose diversifying the training data at the feature level to alleviate this loss of discriminative regions. Specifically, we produce diversified augmented samples by translating image features along semantically meaningful directions. The semantic directions are estimated with a covariance prediction network, which predicts a sample-wise covariance matrix to adapt to the large intra-class variation inherent in fine-grained images. Furthermore, the covariance prediction network is jointly optimized with the classification network in a meta-learning manner to alleviate the degenerate solution problem. Experiments on four competitive fine-grained recognition benchmarks (CUB-200-2011, Stanford Cars, FGVC-Aircraft, NABirds) demonstrate that our method significantly improves the generalization performance of several popular classification networks (e.g., ResNets, DenseNets, EfficientNets, RegNets and ViT). Combined with a recently proposed method, our semantic data augmentation approach achieves state-of-the-art performance on the CUB-200-2011 dataset. Source code is available at https://github.com/LeapLabTHU/LearnableISDA.
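As a rough illustration of the feature-level augmentation described in the abstract, the sketch below shifts a feature vector by a random offset drawn from a zero-mean Gaussian. It is not the authors' implementation. The function name `augment_feature`, the `strength` parameter, and the restriction to a diagonal covariance are assumptions made for this example. In the paper the covariance is predicted per sample by a network trained with meta-learning; here it is simply a hand-written list of variances.

```python
import random


def augment_feature(feature, variances, strength=0.5, rng=None):
    """Translate a feature vector along a randomly sampled direction.

    Assumption for this sketch: the covariance is diagonal, so it is
    given as one non-negative variance per feature dimension.
    """
    if len(feature) != len(variances):
        raise ValueError("feature and variances must have the same length")
    if any(v < 0 for v in variances):
        raise ValueError("variances must be non-negative")
    rng = rng or random.Random()
    # Sample an offset from N(0, strength * diag(variances)) and add it.
    return [
        f + rng.gauss(0.0, (strength * v) ** 0.5)
        for f, v in zip(feature, variances)
    ]


rng = random.Random(0)
feature = [1.0, -2.0, 0.5]
variances = [0.04, 0.0, 1.0]  # hypothetical stand-in for a predicted covariance
augmented = augment_feature(feature, variances, strength=0.5, rng=rng)
```

A dimension with zero variance is left unchanged, while dimensions with larger variance receive larger shifts. This is the sense in which the covariance selects the directions along which a sample is diversified.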
Pages: 3130-3144
Page count: 15