A comparative evaluation and analysis of three generations of Distributional Semantic Models

Cited by: 22
Authors
Lenci, Alessandro [1]
Sahlgren, Magnus [2]
Jeuniaux, Patrick [3]
Gyllensten, Amaru Cuba [4]
Miliani, Martina [1, 5]
Affiliations
[1] Univ Pisa, Pisa, Italy
[2] AI Sweden, Stockholm, Sweden
[3] Inst Natl Criminalist & Criminol, Brussels, Belgium
[4] RISE, Stockholm, Sweden
[5] Univ Stranieri Siena, Siena, Italy
Keywords
Distributional semantics; Evaluation; Contextual embeddings; Representational Similarity Analysis; Word cooccurrence statistics; Representations
DOI
10.1007/s10579-021-09575-z
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Distributional semantics has deeply changed in the last decades. First, predict models stole the thunder from traditional count ones, and more recently both of them were replaced in many NLP applications by contextualized vectors produced by neural language models. Although an extensive body of research has been devoted to Distributional Semantic Model (DSM) evaluation, we still lack a thorough comparison with respect to tested models, semantic tasks, and benchmark datasets. Moreover, previous work has mostly focused on task-driven evaluation, instead of exploring the differences between the way models represent the lexical semantic space. In this paper, we perform a large-scale evaluation of type distributional vectors, either produced by static DSMs or obtained by averaging the contextualized vectors generated by BERT. First of all, we investigate the performance of embeddings in several semantic tasks, carrying out an in-depth statistical analysis to identify the major factors influencing the behavior of DSMs. The results show that (i) the alleged superiority of predict-based models is more apparent than real, and surely not ubiquitous, and (ii) static DSMs surpass BERT representations in most out-of-context semantic tasks and datasets. Furthermore, we borrow from cognitive neuroscience the methodology of Representational Similarity Analysis (RSA) to inspect the semantic spaces generated by distributional models. RSA reveals important differences related to the frequency and part-of-speech of lexical items.
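The two techniques named in the abstract, averaging contextualized vectors into a type vector and comparing semantic spaces with RSA, can be illustrated with a small sketch. This is not the authors' implementation: the function names, the random toy data, and the choice of cosine similarity with a Spearman-style rank correlation are all assumptions made for illustration, since this record does not state the paper's exact settings.

```python
import numpy as np


def type_vector(token_vectors):
    """Average the contextualized vectors of one word's occurrences into one type vector."""
    return np.asarray(token_vectors, dtype=float).mean(axis=0)


def similarity_matrix(embeddings):
    """Pairwise cosine similarities between the rows of an (n_words, dim) array."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T


def _ranks(values):
    """Ordinal ranks (ties are not averaged; an assumption that suits continuous similarities)."""
    ranks = np.empty(len(values), dtype=float)
    ranks[np.argsort(values)] = np.arange(len(values))
    return ranks


def rsa_score(space_a, space_b):
    """Rank correlation between the upper triangles of two similarity matrices.

    Both spaces must list the same words in the same row order.
    """
    upper = np.triu_indices(space_a.shape[0], k=1)
    sims_a = similarity_matrix(space_a)[upper]
    sims_b = similarity_matrix(space_b)[upper]
    return float(np.corrcoef(_ranks(sims_a), _ranks(sims_b))[0, 1])


rng = np.random.default_rng(0)
# Hypothetical static space: 6 words, 10 dimensions
static_space = rng.normal(size=(6, 10))
# Hypothetical contextual data: 4 occurrence vectors per word, 12 dimensions
tokens = rng.normal(size=(6, 4, 12))
contextual_space = np.stack([type_vector(t) for t in tokens])

print(rsa_score(static_space, static_space))      # a space compared with itself
print(rsa_score(static_space, contextual_space))  # two unrelated random spaces
```

Because RSA compares similarity structure instead of raw coordinates, the two spaces may have different dimensionalities, as in the toy data above; only the vocabulary and its ordering need to match.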
Pages: 1269-1313
Number of pages: 45
Source: Language Resources and Evaluation, 2022, 56: 1269-1313