On the Sentence Embeddings from Pre-trained Language Models

Cited by: 0
Authors
Li, Bohan [1 ,2 ]
Zhou, Hao [1 ]
He, Junxian [2 ]
Wang, Mingxuan [1 ]
Yang, Yiming [2 ]
Li, Lei [1 ]
Affiliations
[1] ByteDance AI Lab, Warsaw, Poland
[2] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Keywords
DOI
Not available
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Pre-trained contextual representations like BERT have achieved great success in natural language processing. However, the sentence embeddings from pre-trained language models without fine-tuning have been found to poorly capture the semantic meaning of sentences. In this paper, we argue that the semantic information in the BERT embeddings is not fully exploited. We first reveal the theoretical connection between the masked language model pre-training objective and the semantic similarity task, and then analyze the BERT sentence embeddings empirically. We find that BERT always induces a non-smooth, anisotropic semantic space of sentences, which harms its performance on semantic similarity tasks. To address this issue, we propose to transform the anisotropic sentence embedding distribution into a smooth, isotropic Gaussian distribution through normalizing flows that are learned with an unsupervised objective. Experimental results show that our proposed BERT-flow method obtains significant performance gains over state-of-the-art sentence embeddings on a variety of semantic textual similarity tasks. The code is available at https://github.com/bohanli/BERT-flow.
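The abstract describes a pipeline: take sentence embeddings from a frozen, non-fine-tuned BERT, learn an invertible flow that maps their anisotropic distribution to an isotropic Gaussian by maximizing log-likelihood on unlabeled text, and then compare sentences in the resulting latent space. The following is a minimal, illustrative sketch of that idea, not the authors' released Glow-based implementation: it assumes masked mean pooling over bert-base-uncased (via the HuggingFace transformers library), a toy RealNVP-style affine-coupling flow, and hypothetical names such as Flow, AffineCoupling, and sentence_embeddings.

```python
import math

import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer


class AffineCoupling(nn.Module):
    """RealNVP-style affine coupling: one half of the vector predicts a
    scale/shift for the other half, giving an invertible map with a
    tractable log-determinant. Assumes an even embedding dimension."""

    def __init__(self, dim, hidden=256, flip=False):
        super().__init__()
        self.flip = flip
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 2 * self.half),
        )

    def forward(self, x):
        x1, x2 = x[:, : self.half], x[:, self.half :]
        if self.flip:
            x1, x2 = x2, x1
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                      # keep the scale factors bounded
        z2 = x2 * torch.exp(s) + t
        out = (x1, z2) if not self.flip else (z2, x1)
        return torch.cat(out, dim=-1), s.sum(dim=-1)   # (z, log|det J|)


class Flow(nn.Module):
    """A small stack of coupling layers trained to map sentence embeddings
    to a standard (isotropic) Gaussian by maximum likelihood."""

    def __init__(self, dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            AffineCoupling(dim, flip=bool(i % 2)) for i in range(n_layers)
        )

    def transform(self, x):
        log_det = torch.zeros(x.size(0), device=x.device)
        for layer in self.layers:
            x, ld = layer(x)
            log_det = log_det + ld
        return x, log_det

    def log_prob(self, x):
        z, log_det = self.transform(x)
        log_pz = -0.5 * (z ** 2).sum(-1) - 0.5 * z.size(-1) * math.log(2 * math.pi)
        return log_pz + log_det


@torch.no_grad()
def sentence_embeddings(sentences, tokenizer, bert):
    """Masked mean pooling over the last layer of a frozen BERT
    (one common choice; the paper also averages the last two layers)."""
    enc = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = bert(**enc).last_hidden_state               # [batch, seq_len, dim]
    mask = enc["attention_mask"].unsqueeze(-1).float()
    return (hidden * mask).sum(1) / mask.sum(1)


tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased").eval()

sentences = ["A man is playing a guitar.", "Someone plays an instrument."]
emb = sentence_embeddings(sentences, tokenizer, bert)     # BERT stays frozen

flow = Flow(dim=emb.size(-1))
optimizer = torch.optim.Adam(flow.parameters(), lr=1e-3)
for _ in range(200):                                      # toy unsupervised training
    loss = -flow.log_prob(emb).mean()                     # maximize log-likelihood
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

with torch.no_grad():
    z, _ = flow.transform(emb)                            # compare in the flow space
    print(torch.nn.functional.cosine_similarity(z[0], z[1], dim=0))
```

In the paper the flow is Glow-based and fitted on a full unlabeled target corpus rather than a two-sentence toy batch; the sketch only illustrates the shape of the unsupervised objective and where the similarity is computed.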
Pages: 9119-9130
Number of pages: 12
Related Papers
50 records in total
  • [1] Disentangling Semantics and Syntax in Sentence Embeddings with Pre-trained Language Models
    Huang, James Y.
    Huang, Kuan-Hao
    Chang, Kai-Wei
    [J]. 2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, : 1372 - 1379
  • [2] Distilling Relation Embeddings from Pre-trained Language Models
    Ushio, Asahi
    Camacho-Collados, Jose
    Schockaert, Steven
    [J]. 2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, : 9044 - 9062
  • [3] An Empirical study on Pre-trained Embeddings and Language Models for Bot Detection
    Garcia-Silva, Andres
    Berrio, Cristian
Gomez-Perez, Jose Manuel
    [J]. 4TH WORKSHOP ON REPRESENTATION LEARNING FOR NLP (REPL4NLP-2019), 2019, : 148 - 155
  • [4] From Word Embeddings to Pre-Trained Language Models: A State-of-the-Art Walkthrough
    Mars, Mourad
    [J]. APPLIED SCIENCES-BASEL, 2022, 12 (17):
  • [5] General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference
    Du, Jingfei
    Ott, Myle
    Li, Haoran
    Zhou, Xing
    Stoyanov, Veselin
    [J]. FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, EMNLP 2020, 2020,
  • [6] Integrating Knowledge Graph Embeddings and Pre-trained Language Models in Hypercomplex Spaces
    Nayyeri, Mojtaba
    Wang, Zihao
    Akter, Mst. Mahfuja
    Alam, Mirza Mohtashim
    Rony, Md Rashad Al Hasan
    Lehmann, Jens
    Staab, Steffen
    [J]. SEMANTIC WEB, ISWC 2023, PART I, 2023, 14265 : 388 - 407
  • [7] Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation
    Li, Xiu
    Henriksson, Aron
    Duneld, Martin
    Nouri, Jalal
    Wu, Yongchao
    [J]. FUTURE INTERNET, 2024, 16 (01)
  • [8] Pre-Trained Language Models and Their Applications
    Wang, Haifeng
    Li, Jiwei
    Wu, Hua
    Hovy, Eduard
    Sun, Yu
    [J]. ENGINEERING, 2023, 25 : 51 - 65
  • [9] From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
    Xu, Weiwen
    Li, Xin
    Zhang, Wenxuan
    Zhou, Meng
    Lam, Wai
    Si, Luo
    Bing, Lidong
    [J]. ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [10] Solving ESL Sentence Completion Questions via Pre-trained Neural Language Models
    Liu, Qiongqiong
    Liu, Tianqiao
    Zhao, Jiafu
    Fang, Qiang
    Ding, Wenbiao
    Wu, Zhongqin
    Xia, Feng
    Tang, Jiliang
    Liu, Zitao
    [J]. ARTIFICIAL INTELLIGENCE IN EDUCATION (AIED 2021), PT II, 2021, 12749 : 256 - 261