Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems

Cited by: 2
Authors
Shu, Yangyang [1]
van den Hengel, Anton [1]
Liu, Lingqiao [1]
Affiliations
[1] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
DOI
10.1109/CVPR52729.2023.01096
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Self-supervised learning (SSL) strategies have demonstrated remarkable performance in various recognition tasks. However, both our preliminary investigation and recent studies suggest that they may be less effective at learning representations for fine-grained visual recognition (FGVR), since many features that help optimize SSL objectives are not suitable for characterizing the subtle differences in FGVR. To overcome this issue, we propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes, dubbed common rationales in this paper. Intuitively, common rationales tend to correspond to the discriminative patterns from the key parts of foreground objects. We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective, without using any pre-trained object-part or saliency detectors, so that it can be seamlessly integrated into the existing SSL process. Specifically, we fit the GradCAM with a branch of limited fitting capacity, which allows the branch to capture the common rationales and discard the less common discriminative patterns. At the test stage, the branch generates a set of spatial weights to selectively aggregate the features representing an instance. Extensive experimental results on four visual tasks demonstrate that the proposed method can lead to a significant improvement in different evaluation settings.
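The test-stage step described in the abstract (spatial weights that selectively aggregate features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `normalize_weights` and `weighted_pool`, the clamping of negative scores to zero, the sum-to-one normalization, and the uniform fallback are all assumptions made for illustration, since the abstract does not specify how the weights are produced or normalized.

```python
from typing import List


def normalize_weights(scores: List[List[float]]) -> List[float]:
    """Flatten an H x W score map into non-negative weights that sum to 1.

    Assumption: negative scores are clamped to zero and the map is
    normalized by its sum; the abstract does not state the scheme used.
    """
    flat = [max(s, 0.0) for row in scores for s in row]
    total = sum(flat)
    if total == 0.0:
        # Fall back to uniform weights (plain average pooling).
        return [1.0 / len(flat)] * len(flat)
    return [s / total for s in flat]


def weighted_pool(features: List[List[List[float]]],
                  scores: List[List[float]]) -> List[float]:
    """Aggregate an H x W x C feature map into one C-dimensional vector.

    Each spatial location contributes in proportion to its weight, so
    locations with zero weight are ignored entirely.
    """
    weights = normalize_weights(scores)
    cells = [cell for row in features for cell in row]
    channels = len(cells[0])
    pooled = [0.0] * channels
    for w, cell in zip(weights, cells):
        for c in range(channels):
            pooled[c] += w * cell[c]
    return pooled


# Hypothetical 2 x 2 feature map with 2 channels per location.
features = [[[1.0, 0.0], [5.0, 5.0]],
            [[9.0, 9.0], [3.0, 4.0]]]
# Hypothetical score map that keeps only the two diagonal locations.
scores = [[1.0, 0.0],
          [0.0, 1.0]]
print(weighted_pool(features, scores))  # [2.0, 2.0]
```

With an all-zero score map the sketch reduces to ordinary average pooling, which makes the contrast with selective aggregation easy to see.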
Pages: 11392-11401
Page count: 10