Contrastive learning for unsupervised sentence embeddings using negative samples with diminished semantics

Cited by: 0
Authors
Yu, Zhiyi [1 ]
Li, Hong [1 ]
Feng, Jialin [1 ]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
Source
JOURNAL OF SUPERCOMPUTING | 2024, Vol. 80, No. 4
Funding
National Natural Science Foundation of China;
Keywords
Unsupervised sentence embedding; Contrastive learning; Mild negative sample; Feature suppression;
DOI
10.1007/s11227-023-05682-6
CLC Number
TP3 [Computing Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Unsupervised learning has made significant progress in recent years, driven by advances in contrastive learning. However, current methods for generating negative samples often produce false negatives and cause feature suppression. In this paper, we propose a new contrastive learning method for unsupervised sentence embedding using negative samples with diminished semantics (DSCSE), which includes three optimizations to produce more robust representations with less dependence on undesired features. First, we introduce semantically weakened negative samples, called mild negatives, by blurring the main parts of a sentence in the attention mechanism, allowing the model to learn sentence embeddings that are sensitive to semantic differences between sentences. Second, we leverage the mild negatives to eliminate false negative samples, which harm the model, and to identify hard negative samples, which improve its performance; this filtering raises the quality of the negatives used for training. Finally, we introduce a novel loss function, triplet fusion loss (TFL), that considers negative samples at different levels and leverages the filtered negatives to improve the learned sentence embeddings. Experimental results on multiple semantic textual similarity tasks show that DSCSE outperforms unsupervised SimCSE by +1.52% in Spearman's correlation, demonstrating its effectiveness in learning sentence embeddings.
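The abstract does not give the exact form of TFL or of the filtering rule, so the following is only a rough, hypothetical sketch of one piece of the idea: using a semantically weakened "mild" negative as a reference point to drop likely false negatives from a standard InfoNCE denominator. All names and the threshold rule here are illustrative assumptions, not the paper's method.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two vectors (plain Python lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def filtered_info_nce(anchor, positive, negatives, mild, tau=0.05):
    """InfoNCE loss for one anchor, dropping likely false negatives.

    Hypothetical rule: a negative that is closer to the anchor than the
    semantically weakened 'mild' negative is treated as a probable false
    negative and excluded from the denominator.
    """
    mild_sim = cos_sim(anchor, mild)
    kept = [n for n in negatives if cos_sim(anchor, n) <= mild_sim]
    pos = math.exp(cos_sim(anchor, positive) / tau)
    denom = pos + sum(math.exp(cos_sim(anchor, n) / tau) for n in kept)
    return -math.log(pos / denom)
```

With a near-duplicate of the anchor among the negatives, the filtered loss is smaller than the unfiltered one, since the near-duplicate (a likely false negative) no longer inflates the denominator.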
Pages: 5428-5445
Page count: 18
Related Papers
50 records in total
  • [2] Robust Contrastive Learning Using Negative Samples with Diminished Semantics
    Ge, Songwei
    Mishra, Shlok
    Wang, Haohan
    Li, Chun-Liang
    Jacobs, David
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 34 (NEURIPS 2021), 2021, 34
  • [3] Contrastive Learning for Unsupervised Sentence Embedding with False Negative Calibration
    Chiu, Chi-Min
    Lin, Ying-Jia
    Kao, Hung-Yu
    [J]. ADVANCES IN KNOWLEDGE DISCOVERY AND DATA MINING, PT III, PAKDD 2024, 2024, 14647 : 290 - 301
  • [4] Composition-contrastive Learning for Sentence Embeddings
    Chanchani, Sachin
    Huang, Ruihong
    [J]. PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1, 2023, : 15836 - 15848
  • [5] SimCSE: Simple Contrastive Learning of Sentence Embeddings
    Gao, Tianyu
    Yao, Xingcheng
    Chen, Danqi
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 6894 - 6910
  • [6] MCSE: Multimodal Contrastive Learning of Sentence Embeddings
    Zhang, Miaoran
    Mosbach, Marius
    Adelani, David Ifeoluwa
    Hedderich, Michael A.
    Klakow, Dietrich
    [J]. NAACL 2022: THE 2022 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES, 2022, : 5959 - 5969
  • [8] Importance-aware contrastive learning via semantically augmented instances for unsupervised sentence embeddings
    Ma, Xin
    Li, Hong
    Shi, Jiawen
    Zhang, Yi
    Long, Zhigao
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2023, 14 (09) : 2979 - 2990
  • [9] BlendCSE: Blend contrastive learnings for sentence embeddings with rich semantics and transferability
    Xu, Jiahao
    Zhanyi, Charlie Soh
    Xu, Liwen
    Chen, Lihui
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [10] Debiased Contrastive Learning of Unsupervised Sentence Representations
    Zhou, Kun
    Zhang, Beichen
    Zhao, Wayne Xin
    Wen, Ji-Rong
    [J]. PROCEEDINGS OF THE 60TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), VOL 1: (LONG PAPERS), 2022, : 6120 - 6130