Similarity contrastive estimation for image and video soft contrastive self-supervised learning

Authors
Julien Denize
Jaonary Rabarisoa
Astrid Orcesi
Romain Hérault
Affiliations
[1] Université Paris-Saclay, LITIS, INSA Rouen
[2] CEA
[3] List
[4] Normandie Université
Keywords
Deep learning; Self-supervised learning; Contrastive; Representation
Abstract
Contrastive representation learning has proven to be an effective self-supervised learning method for images and videos. Most successful approaches are based on Noise Contrastive Estimation (NCE) and use different views of an instance as positives that are contrasted with other instances, called negatives, which are treated as noise. However, several instances in a dataset are drawn from the same distribution and share underlying semantic information. A good data representation should capture the relations between instances, that is, their semantic similarity and dissimilarity, which contrastive learning harms by treating all negatives as noise. To circumvent this issue, we propose a novel formulation of contrastive learning, called Similarity Contrastive Estimation (SCE), that uses the semantic similarity between instances. Our training objective is a soft contrastive one that brings the positives closer and estimates a continuous distribution to push or pull negative instances based on their learned similarities. We empirically validate our approach on both image and video representation learning. We show that SCE performs competitively with the state of the art on the ImageNet linear evaluation protocol with fewer pretraining epochs and that it generalizes to several downstream image tasks. We also show that SCE reaches state-of-the-art results for video representation pretraining and that the learned representation generalizes to video downstream tasks. Source code is available at https://github.com/juliendenize/eztorch.
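The soft contrastive objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked above for that); it only shows the general idea of replacing the one-hot contrastive target with a mixture of the one-hot positive and a similarity distribution over the other instances. The function names, the mixing weight `lam`, the temperatures `tau` and `tau_target`, and the use of in-batch negatives are all assumptions made for illustration. Setting `lam=1.0` recovers a hard, InfoNCE-style objective in which every negative is treated as noise.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def softmax(scores, temperature):
    """Numerically stable softmax of scores / temperature."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_targets(targets, i, lam, tau_target):
    """Hypothetical soft target for instance i (targets must be unit length).

    Mixes a one-hot vector on the positive with a similarity distribution
    over the other instances; the instance itself is masked out of the latter.
    """
    n = len(targets)
    sims = [dot(targets[i], targets[j]) if j != i else -math.inf for j in range(n)]
    relation = softmax(sims, tau_target)
    return [lam * (1.0 if j == i else 0.0) + (1.0 - lam) * relation[j] for j in range(n)]


def soft_contrastive_loss(queries, targets, lam=0.5, tau=0.1, tau_target=0.05):
    """Mean cross-entropy between soft targets and predicted similarities.

    queries[i] and targets[i] are embeddings of two views of instance i.
    Assumption: negatives are the other instances in the same batch.
    """
    if len(queries) != len(targets) or len(queries) < 2:
        raise ValueError("need at least two paired instances")
    queries = [normalize(q) for q in queries]
    targets = [normalize(t) for t in targets]
    total = 0.0
    for i, query in enumerate(queries):
        # Predicted distribution: view 1 of instance i against view 2 of all instances.
        pred = softmax([dot(query, t) for t in targets], tau)
        weights = soft_targets(targets, i, lam, tau_target)
        total -= sum(w * math.log(p) for w, p in zip(weights, pred))
    return total / len(queries)
```

In this sketch, a negative that is similar to the positive receives a non-zero target weight and is therefore pulled closer rather than pushed away, while dissimilar negatives receive a weight near zero and are still pushed away.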