EMCRL: EM-Enhanced Negative Sampling Strategy for Contrastive Representation Learning

Cited by: 0
Authors
Zhang, Kun [1]
Lv, Guangyi [2]
Wu, Le [1]
Hong, Richang [1]
Wang, Meng [1]
Affiliations
[1] Hefei Univ Technol, Sch Comp & Informat, Hefei 230029, Anhui, Peoples R China
[2] Lenovo Res, AI Lab, Beijing 100094, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Representation learning; Data augmentation; Data models; Semantics; Optimization; Estimation; Sampling methods; Robustness; Natural languages; Crops; Contrastive learning (CL); expectation maximization (EM); negative examples; representation learning;
DOI
10.1109/TCSS.2024.3454056
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology]
Subject Classification Code
0812
Abstract
As one representative framework of self-supervised learning (SSL), contrastive learning (CL) has drawn enormous attention in the representation learning area. By pulling a "positive" example toward an anchor and pushing many "negative" examples away from it, CL is able to generate high-quality representations for data of different modalities. The quality of the selected positive and negative examples is therefore critical to the performance of CL-based models. However, under the assumption that labels are unavailable, most existing work follows the paradigm of contrastive instance discrimination, which treats each input instance as an individual category. Such methods focus mainly on positive example generation and design plenty of data augmentation strategies, while for negative examples they simply adopt the in-batch negative sampling strategy. We argue that this negative sampling strategy easily selects false negatives and inhibits the capability of CL, which we also believe is one of the reasons why CL needs a large number of negatives. Instead of resorting to annotated labels, we tackle this problem in an unsupervised manner. We propose to integrate expectation maximization (EM) into the selection of negative examples and develop a novel EM-enhanced negative sampling strategy (EMCRL) that distinguishes false negatives from true ones to improve CL performance. Specifically, EMCRL employs EM to estimate the distribution of ground-truth relations between each sample and its in-batch negatives and then optimizes the model parameters with these estimates. Considering the sensitivity of the EM algorithm to parameter initialization, we add a random flip to the distribution estimation to enhance the robustness of the learning process. Extensive experiments with several advanced models on sentence representation and image representation tasks demonstrate the effectiveness of EMCRL. Our method is easy to implement, and the code is publicly available at https://github.com/zhangkunzk/EMCRL_pytorch.
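The abstract describes the method only at a high level. As a rough illustration, the PyTorch sketch below shows one way an EM-style reweighting of in-batch negatives with a random flip could look; it is a minimal sketch under stated assumptions, not the authors' implementation (see the repository above for that). The function name emcrl_style_loss, the sigmoid-based posterior estimate, and the flip_prob parameter are all illustrative assumptions.

# Hypothetical sketch of EM-style reweighting of in-batch negatives,
# NOT the authors' implementation (see the EMCRL_pytorch repository).
import torch
import torch.nn.functional as F

def emcrl_style_loss(z, z_pos, temperature=0.05, flip_prob=0.1):
    """z, z_pos: (B, D) embeddings of anchors and their positives.

    E-step: estimate, for each in-batch negative, the probability that
    it is a false negative (approximated here from current similarities),
    randomly flipping some estimates for robustness to initialization.
    M-step: update the encoder with the reweighted contrastive loss.
    """
    z = F.normalize(z, dim=-1)
    z_pos = F.normalize(z_pos, dim=-1)
    sim = z @ z_pos.t() / temperature                 # (B, B) similarities
    eye = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)

    with torch.no_grad():
        # E-step: high similarity to the anchor -> likely a false negative.
        p_false = torch.sigmoid(sim)                  # crude posterior estimate
        flip = torch.rand_like(p_false) < flip_prob   # random flip for robustness
        p_false = torch.where(flip, 1.0 - p_false, p_false)
        w = (1.0 - p_false).masked_fill(eye, 0.0)     # down-weight false negatives

    # M-step: weighted InfoNCE with positives on the diagonal.
    exp_sim = sim.exp()
    pos = exp_sim.diagonal()
    neg = (w * exp_sim).sum(dim=1)
    return -(pos / (pos + neg)).log().mean()

In a training loop, z and z_pos would come from two augmented views of the same batch; with flip_prob=0 and all off-diagonal weights fixed to 1, the loss reduces to standard in-batch InfoNCE.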
Pages: 12
Related Papers
50 records in total
  • [1] Contrastive Speaker Representation Learning with Hard Negative Sampling for Speaker Recognition. Go, Changhwan; Lee, Young Han; Kim, Taewoo; Park, Nam In; Chun, Chanjun. SENSORS, 2024, 24 (19).
  • [2] Event representation via contrastive learning with prototype based hard negative sampling. Kong, Jing; Yang, Zhouwang. NEUROCOMPUTING, 2024, 600.
  • [3] Hard Negative Sampling via Regularized Optimal Transport for Contrastive Representation Learning. Jiang, Ruijie; Ishwar, Prakash; Aeron, Shuchin. 2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2023.
  • [4] SECL: Sampling enhanced contrastive learning. Tang, Yixin; Cheng, Hua; Fang, Yiquan; Cheng, Tao. AI COMMUNICATIONS, 2023, 36 (01): 1-12.
  • [5] RevGNN: Negative Sampling Enhanced Contrastive Graph Learning for Academic Reviewer Recommendation. Liao, Weibin; Zhi, Yifan; Zhang, Qi; Ou, Zhonghong; Li, Xuesong. ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2024, 43 (01).
  • [6] ConCur: Self-supervised graph representation based on contrastive learning with curriculum negative sampling. Yan, Rong; Bao, Peng. NEUROCOMPUTING, 2023, 551.
  • [7] Predicting microbe-drug associations with structure-enhanced contrastive learning and self-paced negative sampling strategy. Tian, Zhen; Yu, Yue; Fang, Haichuan; Xie, Weixin; Guo, Maozu. BRIEFINGS IN BIOINFORMATICS, 2023, 24 (02).
  • [8] Difficulty-based Sampling for Debiased Contrastive Representation Learning. Jang, Taeuk; Wang, Xiaoqian. 2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2023: 24039-24048.
  • [9] Clustering Enhanced Multiplex Graph Contrastive Representation Learning. Yuan, Ruiwen; Tang, Yongqiang; Wu, Yajing; Zhang, Wensheng. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2025, 36 (01): 1341-1355.
  • [10] Negative samples selecting strategy for graph contrastive learning. Miao, Rui; Yang, Yintao; Ma, Yao; Juan, Xin; Xue, Haotian; Tang, Jiliang; Wang, Ying; Wang, Xin. INFORMATION SCIENCES, 2022, 613: 667-681.