Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems

被引:2
|
作者
Shu, Yangyang [1 ]
van den Hengel, Anton [1 ]
Liu, Lingqiao [1 ]
机构
[1] Univ Adelaide, Sch Comp Sci, Adelaide, SA, Australia
关键词
D O I
10.1109/CVPR52729.2023.01096
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Self-supervised learning (SSL) strategies have demonstrated remarkable performance in various recognition tasks. However, both our preliminary investigation and recent studies suggest that they may be less effective in learning representations for fine-grained visual recognition (FGVR) since many features helpful for optimizing SSL objectives are not suitable for characterizing the subtle differences in FGVR. To overcome this issue, we propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes, dubbed as common rationales in this paper. Intuitively, common rationales tend to correspond to the discriminative patterns from the key parts of foreground objects. We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective without using any pre-trained object parts or saliency detectors, making it seamlessly to be integrated with the existing SSL process. Specifically, we fit the GradCAM with a branch with limited fitting capacity, which allows the branch to capture the common rationales and discard the less common discriminative patterns. At the test stage, the branch generates a set of spatial weights to selectively aggregate features representing an instance. Extensive experimental results on four visual tasks demonstrate that the proposed method can lead to a significant improvement in different evaluation settings.(1)
引用
收藏
页码:11392 / 11401
页数:10
相关论文
共 50 条
  • [21] Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization
    Larsson, Mans
    Stenborg, Erik
    Toft, Carl
    Hammarstrand, Lars
    Sattler, Torsten
    Kahl, Fredrik
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 31 - 41
  • [22] Mixed Autoencoder for Self-supervised Visual Representation Learning
    Chen, Kai
    Liu, Zhili
    Hong, Lanqing
    Xu, Hang
    Li, Zhenguo
    Yeung, Dit-Yan
    [J]. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023, : 22742 - 22751
  • [23] Self-supervised representation learning for surgical activity recognition
    Paysan, Daniel
    Haug, Luis
    Bajka, Michael
    Oelhafen, Markus
    Buhmann, Joachim M.
    [J]. INTERNATIONAL JOURNAL OF COMPUTER ASSISTED RADIOLOGY AND SURGERY, 2021, 16 (11) : 2037 - 2044
  • [24] Scaling and Benchmarking Self-Supervised Visual Representation Learning
    Goyal, Priya
    Mahajan, Dhruv
    Gupta, Abhinav
    Misra, Ishan
    [J]. 2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 6400 - 6409
  • [25] Self-Supervised Visual Representation Learning with Semantic Grouping
    Wen, Xin
    Zhao, Bingchen
    Zheng, Anlin
    Zhang, Xiangyu
    Qi, Xiaojuan
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [26] Self-supervised representation learning by predicting visual permutations
    Zhao, Qilu
    Dong, Junyu
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 210
  • [27] Self-supervised Visual Representation Learning for Histopathological Images
    Yang, Pengshuai
    Hong, Zhiwei
    Yin, Xiaoxu
    Zhu, Chengzhan
    Jiang, Rui
    [J]. MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT II, 2021, 12902 : 47 - 57
  • [28] Transitive Invariance for Self-supervised Visual Representation Learning
    Wang, Xiaolong
    He, Kaiming
    Gupta, Abhinav
    [J]. 2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1338 - 1347
  • [29] Self-Supervised ECG Representation Learning for Emotion Recognition
    Sarkar, Pritam
    Etemad, Ali
    [J]. IEEE TRANSACTIONS ON AFFECTIVE COMPUTING, 2022, 13 (03) : 1541 - 1554
  • [30] Self-supervised representation learning for surgical activity recognition
    Daniel Paysan
    Luis Haug
    Michael Bajka
    Markus Oelhafen
    Joachim M. Buhmann
    [J]. International Journal of Computer Assisted Radiology and Surgery, 2021, 16 : 2037 - 2044