Self-supervised Learning of Contextualized Local Visual Embeddings

Cited by: 0
Authors
Silva, Thalles [1]
Pedrini, Helio [1]
Rivera, Adin Ramirez [2]
Affiliations
[1] Univ Estadual Campinas, Inst Comp, Campinas, SP, Brazil
[2] Univ Oslo, Dept Informat, Oslo, Norway
DOI
10.1109/ICCVW60793.2023.00025
Chinese Library Classification (CLC) number
TP18 [Theory of artificial intelligence];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We present Contextualized Local Visual Embeddings (CLoVE), a self-supervised, convolution-based method that learns representations suited for dense prediction tasks. CLoVE deviates from current methods and optimizes a single loss function that operates at the level of contextualized local embeddings learned from the output feature maps of convolutional neural network (CNN) encoders. To learn contextualized embeddings, CLoVE proposes a normalized multihead self-attention layer that combines local features from different parts of an image based on similarity. We extensively benchmark CLoVE's pre-trained representations on multiple datasets. CLoVE reaches state-of-the-art performance for CNN-based architectures in four dense prediction downstream tasks: object detection, instance segmentation, keypoint detection, and dense pose estimation. Code: https://github.com/sthalles/CLoVE.
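The idea of combining local features by similarity can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it only assumes that "normalized" means L2-normalizing each local feature so attention scores become cosine similarities, and it omits any learned projections. The function name `contextualize` and the parameters `num_heads` and `temperature` are hypothetical, chosen for illustration.

```python
import numpy as np


def contextualize(feature_map, num_heads=4, temperature=0.1):
    """Mix local embeddings of one image by multihead similarity attention.

    feature_map: array of shape (H, W, C), e.g. a CNN encoder's output.
    Returns an array of the same shape with contextualized embeddings.
    Assumption: no learned query/key/value projections, for brevity.
    """
    h, w, c = feature_map.shape
    assert c % num_heads == 0, "channels must divide evenly across heads"
    d = c // num_heads

    # Flatten the grid into N = H*W local embeddings and split channels into heads.
    x = feature_map.reshape(h * w, num_heads, d).transpose(1, 0, 2)  # (heads, N, d)

    # L2-normalize so dot products are cosine similarities (assumed normalization).
    x_norm = x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)
    sim = x_norm @ x_norm.transpose(0, 2, 1) / temperature  # (heads, N, N)

    # Numerically stable softmax over the positions being attended to.
    sim = sim - sim.max(axis=-1, keepdims=True)
    weights = np.exp(sim)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Each output embedding is a similarity-weighted average of all local features.
    out = weights @ x  # (heads, N, d)
    return out.transpose(1, 0, 2).reshape(h, w, c)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fmap = rng.normal(size=(7, 7, 64))
    print(contextualize(fmap).shape)
```

Because each output is a convex combination of the input features, similar regions reinforce one another while dissimilar regions contribute little; a lower `temperature` makes that mixing more selective.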
Pages: 177-186
Number of pages: 10