Self-supervised Learning of Contextualized Local Visual Embeddings

Authors
Silva, Thalles [1]
Pedrini, Helio [1]
Rivera, Adin Ramirez [2]
Affiliations
[1] Univ Estadual Campinas, Inst Comp, Campinas, SP, Brazil
[2] Univ Oslo, Dept Informat, Oslo, Norway
DOI
10.1109/ICCVW60793.2023.00025
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Contextualized Local Visual Embeddings (CLoVE), a self-supervised convolution-based method that learns representations suited for dense prediction tasks. CLoVE deviates from current methods and optimizes a single loss function that operates at the level of contextualized local embeddings learned from the output feature maps of convolutional neural network (CNN) encoders. To learn contextualized embeddings, CLoVE proposes a normalized multi-head self-attention layer that combines local features from different parts of an image based on similarity. We extensively benchmark CLoVE's pre-trained representations on multiple datasets. CLoVE reaches state-of-the-art performance for CNN-based architectures in 4 dense prediction downstream tasks: object detection, instance segmentation, keypoint detection, and dense pose estimation. Code: https://github.com/sthalles/CLoVE.
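
The abstract's idea of combining local features by similarity can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it assumes that "normalized" means L2-normalizing each head's slice of a local feature so that attention scores are cosine similarities, and it omits learned projections entirely. The function names, the `num_heads` and `temperature` parameters, and their default values are hypothetical, chosen only for illustration.

```python
import math


def l2_normalize(v, eps=1e-12):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def contextualize(features, num_heads=2, temperature=0.1):
    """Mix N local feature vectors by per-head cosine similarity.

    features: list of N vectors of length C, e.g. a flattened H*W feature map.
    Assumption: no learned query/key/value projections; each head simply
    works on its own slice of the channel dimension.
    """
    channels = len(features[0])
    assert channels % num_heads == 0, "channels must split evenly across heads"
    d = channels // num_heads
    out = [[] for _ in features]
    for h in range(num_heads):
        # Per-head slices, L2-normalized so dot products are cosine similarities.
        heads = [l2_normalize(f[h * d:(h + 1) * d]) for f in features]
        for i, q in enumerate(heads):
            sims = [sum(a * b for a, b in zip(q, k)) / temperature for k in heads]
            weights = softmax(sims)
            # Similarity-weighted average of all local features for this head.
            mixed = [sum(w * k[j] for w, k in zip(weights, heads)) for j in range(d)]
            out[i].extend(mixed)
    return out


if __name__ == "__main__":
    # Three toy "locations" with 4 channels; the first two are identical.
    feats = [[1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]]
    for row in contextualize(feats):
        print([round(x, 4) for x in row])
```

In this toy input, the two identical locations attend almost entirely to each other, so their contextualized embeddings stay close to the original direction, while the dissimilar third location contributes very little to them.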
Pages: 177-186
Number of pages: 10
Related papers
50 in total
  • [21] A Survey on Masked Autoencoder for Visual Self-supervised Learning
    Zhang, Chaoning
    Zhang, Chenshuang
    Song, Junha
    Yi, John Seon Keun
    Kweon, In So
    PROCEEDINGS OF THE THIRTY-SECOND INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2023, 2023, : 6805 - 6813
  • [22] Transitive Invariance for Self-supervised Visual Representation Learning
    Wang, Xiaolong
    He, Kaiming
    Gupta, Abhinav
    2017 IEEE INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV), 2017, : 1338 - 1347
  • [23] Self-supervised Visual Representation Learning for Histopathological Images
    Yang, Pengshuai
    Hong, Zhiwei
    Yin, Xiaoxu
    Zhu, Chengzhan
    Jiang, Rui
    MEDICAL IMAGE COMPUTING AND COMPUTER ASSISTED INTERVENTION - MICCAI 2021, PT II, 2021, 12902 : 47 - 57
  • [24] Self-supervised Visual Attribute Learning for Fashion Compatibility
    Kim, Donghyun
    Saito, Kuniaki
    Mishra, Samarth
    Sclaroff, Stan
    Saenko, Kate
    Plummer, Bryan A.
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS (ICCVW 2021), 2021, : 1057 - 1066
  • [25] Self-Supervised Visual Representation Learning with Semantic Grouping
    Wen, Xin
    Zhao, Bingchen
    Zheng, Anlin
    Zhang, Xiangyu
    Qi, Xiaojuan
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 35 (NEURIPS 2022), 2022,
  • [26] Self-supervised representation learning by predicting visual permutations
    Zhao, Qilu
    Dong, Junyu
    KNOWLEDGE-BASED SYSTEMS, 2020, 210
  • [27] Self-Supervised Visual Descriptor Learning for Dense Correspondence
    Schmidt, Tanner
    Newcombe, Richard
    Fox, Dieter
    IEEE ROBOTICS AND AUTOMATION LETTERS, 2017, 2 (02): : 420 - 427
  • [28] Boosting Self-Supervised Embeddings for Speech Enhancement
    Hung, Kuo-Hsuan
    Fu, Szu-Wei
    Tseng, Huan-Hsin
    Chiang, Hsin-Tien
    Tsao, Yu
    Lin, Chii-Wann
    INTERSPEECH 2022, 2022, : 186 - 190
  • [29] SELF-SUPERVISED LEARNING FOR AUDIO-VISUAL SPEAKER DIARIZATION
    Ding, Yifan
    Xu, Yong
    Zhang, Shi-Xiong
    Cong, Yahuan
    Wang, Liqiang
    2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 4367 - 4371
  • [30] Sequential Adversarial Learning for Self-Supervised Deep Visual Odometry
    Li, Shunkai
    Xue, Fei
    Wang, Xin
    Yan, Zike
    Zha, Hongbin
    2019 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2019), 2019, : 2851 - 2860