SupCL-Seq: Supervised Contrastive Learning for Downstream Optimized Sequence Representations

Cited by: 0

Authors
Sedghamiz, Hooman [1 ]
Raval, Shivam [1 ]
Santus, Enrico [1 ]
Alhanai, Tuka [2 ]
Ghassemi, Mohammad [3 ]
Affiliations
[1] Bayer Pharmaceut, DSIG, Whippany, NJ 07981 USA
[2] New York Univ, Abu Dhabi, U Arab Emirates
[3] Michigan State Univ, E Lansing, MI 48824 USA
Keywords: (none listed)
DOI: (none available)
CLC number: TP18 [Artificial Intelligence Theory]
Subject classification codes: 081104; 0812; 0835; 1405
Abstract
While contrastive learning has proven to be an effective training strategy in computer vision, Natural Language Processing (NLP) has only recently adopted it as a self-supervised alternative to Masked Language Modeling (MLM) for improving sequence representations. This paper introduces SupCL-Seq, which extends supervised contrastive learning from computer vision to the optimization of sequence representations in NLP. By altering the dropout mask probability in standard Transformer architectures (e.g., BERTbase), we generate augmented altered views for every representation (anchor). A supervised contrastive loss is then used to maximize the system's capability of pulling together similar samples (e.g., anchors and their altered views) and pushing apart samples belonging to other classes. Despite its simplicity, SupCL-Seq leads to large gains in many sequence classification tasks on the GLUE benchmark compared to a standard BERTbase, including 6% absolute improvement on CoLA, 5.4% on MRPC, 4.7% on RTE and 2.6% on STS-B. We also show consistent gains over self-supervised contrastively learned representations, especially in non-semantic tasks. Finally, we show that these gains are not solely due to augmentation, but rather to a downstream-optimized sequence representation. Code: https://github.com/hooman650/SupCL-Seq
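The abstract describes the core training objective: each sentence is encoded several times under different dropout masks to obtain augmented views, and a supervised contrastive loss pulls together representations that share a class label while pushing apart the rest. Below is a minimal PyTorch sketch of such a SupCon-style loss over (N, D) embeddings in which every sentence contributes several dropout views; it is an illustration only, not the authors' implementation (the official code is at the repository linked above), and the function name and temperature value are illustrative choices.

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features: torch.Tensor,
                                labels: torch.Tensor,
                                temperature: float = 0.1) -> torch.Tensor:
    # features: (N, D) embeddings; each sentence contributes several rows,
    # one per dropout-augmented view. labels: (N,) class ids, identical for
    # all views of the same sentence. (Sketch, not the official SupCL-Seq code.)
    features = F.normalize(features, dim=1)
    sim = features @ features.t() / temperature              # (N, N) scaled cosine similarities
    self_mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    sim = sim.masked_fill(self_mask, float("-inf"))          # exclude self-comparisons
    # Positives: samples with the same class label (other dropout views of the
    # anchor, or other same-class sentences), excluding the anchor itself.
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_counts = pos_mask.sum(dim=1).clamp(min=1)
    # Mean log-probability over positives, negated and averaged over anchors.
    loss = -log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_counts
    return loss.mean()

# Toy usage: random stand-ins for two sentences, each with two dropout views.
feats = torch.randn(4, 768)
labels = torch.tensor([0, 0, 1, 1])
print(supervised_contrastive_loss(feats, labels).item())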
Pages: 3398-3403
Page count: 6