An Efficient Self-Supervised Cross-View Training For Sentence Embedding

Cited by: 0
|
Authors
Limkonchotiwat, Peerat [1 ]
Ponwitayarat, Wuttikorn [1 ]
Lowphansirikul, Lalita [1 ]
Udomcharoenchaikit, Can [1 ]
Chuangsuwanich, Ekapol [2 ]
Nutanong, Sarana [1 ]
Affiliations
[1] VISTEC, Sch Informat Sci & Technol, Rayong, Thailand
[2] Chulalongkorn Univ, Dept Comp Engn, Bangkok, Thailand
Keywords
Computational linguistics - Semantics
DOI
10.1162/tacl_a_00620
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Self-supervised sentence representation learning is the task of constructing an embedding space for sentences without relying on human annotation. One straightforward approach is to fine-tune a pretrained language model (PLM) with a representation learning method such as contrastive learning. While this approach achieves impressive performance on larger PLMs, the performance degrades rapidly as the number of parameters decreases. In this paper, we propose a framework called Self-supervised Cross-View Training (SCT) to narrow the performance gap between large and small PLMs. To evaluate the effectiveness of SCT, we compare it to five baseline and state-of-the-art competitors on seven Semantic Textual Similarity (STS) benchmarks, using five PLMs whose parameter counts range from 4M to 340M. The experimental results show that SCT outperforms the competitors for PLMs with fewer than 100M parameters in 18 of 21 cases.
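The baseline recipe the abstract refers to, fine-tuning a PLM with a contrastive objective, can be sketched as follows. This is a minimal, SimCSE-style illustration in PyTorch assuming a Hugging Face encoder; the model name and the `embed` and `contrastive_loss` helpers are illustrative choices, and the paper's own SCT cross-view objective is not reproduced here.

```python
# Minimal sketch of contrastive fine-tuning for sentence embeddings
# (SimCSE-style; illustrates the baseline approach, not SCT itself).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumption: any encoder PLM works here
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)

def embed(sentences):
    """Encode sentences and mean-pool token states into sentence vectors."""
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # mean pooling

def contrastive_loss(sentences, temperature=0.05):
    """InfoNCE over two dropout-noised views of the same batch."""
    z1 = embed(sentences)  # view 1 (dropout active in train mode)
    z2 = embed(sentences)  # view 2 (a different dropout mask)
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0),
                              dim=-1) / temperature      # (B, B)
    labels = torch.arange(len(sentences))  # positives on the diagonal
    return F.cross_entropy(sim, labels)

encoder.train()  # keep dropout on so the two passes differ
loss = contrastive_loss(["A man is playing guitar.",
                         "The weather is nice today."])
loss.backward()
```

Because dropout stays active in train mode, the two forward passes yield slightly different embeddings of the same sentence, which serve as the positive pair; all other in-batch sentences act as negatives.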
Pages: 1572-1587
Page count: 16
Related papers
50 items in total
  • [31] Self-Supervised Sentence Polishing by Adding Engaging Modifiers
    Zhang, Zhexin
    Guan, Jian
    Cui, Xin
    Ran, Yu
    Liu, Bo
    Huang, Minlie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL-DEMO 2023, VOL 3, 2023, : 499 - 507
  • [32] CDS: Cross-Domain Self-supervised Pre-training
    Kim, Donghyun
    Saito, Kuniaki
    Oh, Tae-Hyun
    Plummer, Bryan A.
    Sclaroff, Stan
    Saenko, Kate
    2021 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION (ICCV 2021), 2021, : 9103 - 9112
  • [33] DEPA: Self-Supervised Audio Embedding for Depression Detection
    Zhang, Pingyue
    Wu, Mengyue
    Dinkel, Heinrich
    Yu, Kai
    PROCEEDINGS OF THE 29TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, MM 2021, 2021, : 135 - 143
  • [34] Embedding Imputation With Self-Supervised Graph Neural Networks
    Varolgunes, Uras
    Yao, Shibo
    Ma, Yao
    Yu, Dantong
    IEEE ACCESS, 2023, 11 : 70610 - 70620
  • [35] Self-supervised learning of neighborhood embedding for longitudinal MRI
    Ouyang, Jiahong
    Zhao, Qingyu
    Adeli, Ehsan
    Zaharchuk, Greg
    Pohl, Kilian M.
    MEDICAL IMAGE ANALYSIS, 2022, 82
  • [36] SELF-SUPERVISED DISENTANGLED EMBEDDING FOR ROBUST IMAGE CLASSIFICATION
    Liu, Lanqing
    Duan, Zhenyu
    Xu, Guozheng
    Xu, Yi
    2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2021, : 1494 - 1498
  • [37] An Embedding-Dynamic Approach to Self-Supervised Learning
    Moon, Suhong
    Buracas, Domas
    Park, Seunghyun
    Kim, Jinkyu
    Canny, John
    2023 IEEE/CVF WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV), 2023, : 2749 - 2757
  • [38] Contrastive Self-Supervised Speaker Embedding With Sequential Disentanglement
    Tu, Youzhi
    Mak, Man-Wai
    Chien, Jen-Tzung
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2024, 32 : 2704 - 2715
  • [39] Self-Adaptive Training: Bridging Supervised and Self-Supervised Learning
    Huang, Lang
    Zhang, Chao
    Zhang, Hongyang
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2024, 46 (03) : 1362 - 1377
  • [40] DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning
    Wang, Xinghao
    He, Junliang
    Wang, Pengyu
    Zhou, Yunhua
    Sun, Tianxiang
    Qiu, Xipeng
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 17, 2024, : 19180 - 19188