Pitfalls in the Evaluation of Sentence Embeddings

Authors
Eger, Steffen [1, 2]
Rueckle, Andreas [1]
Gurevych, Iryna [1, 2]
Affiliations
[1] Tech Univ Darmstadt, Dept Comp Sci, Ubiquitous Knowledge Proc Lab UKP TUDA, Darmstadt, Germany
[2] Tech Univ Darmstadt, Dept Comp Sci, Res Training Grp AIPHES, Darmstadt, Germany
DOI
Not available
Chinese Library Classification
H0 [Linguistics]
Discipline classification codes
030303; 0501; 050102
Abstract
Deep learning models continuously break new records across different NLP tasks. At the same time, their success exposes weaknesses of model evaluation. Here, we compile several key pitfalls in the evaluation of sentence embeddings, a currently very popular NLP paradigm. These pitfalls include the comparison of embeddings of different sizes, normalization of embeddings, and the low (and diverging) correlations between transfer and probing tasks. Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research. Based on our insights, we also recommend better practices for future evaluations of sentence embeddings.
Pages: 55-60
Number of pages: 6