Embeddings Evaluation Using a Novel Measure of Semantic Similarity

Cited by: 0
Authors
Anna Giabelli
Lorenzo Malandri
Fabio Mercorio
Mario Mezzanzanica
Navid Nobani
Affiliations
[1] Univ. of Milan-Bicocca, Dept. of Informatics, Systems & Communication
[2] University of Milano Bicocca, CRISP Research Centre
[3] Univ. of Milan-Bicocca, Dept. of Statistics and Quantitative Methods
[4] Digital Attitude
Source
Cognitive Computation | 2022 / Vol. 14
DOI
Not available
Abstract
Lexical taxonomies and distributional representations are widely used to support a range of NLP applications, including semantic similarity measurement. Recently, several scholars have proposed new approaches that combine those resources into a unified representation preserving both distributional and knowledge-based lexical features. In this paper, we propose and implement TaxoVec, a novel approach to selecting word embeddings based on their ability to preserve taxonomic similarity. In TaxoVec, we first compute the pairwise semantic similarity between taxonomic words through a new measure we previously developed, the Hierarchical Semantic Similarity (HSS), which we show outperforms previous measures on several benchmark tasks. Then, we train several embedding models on a text corpus and select the best model, that is, the model that maximizes the correlation between the HSS and the cosine similarity of the pairs of words that appear in both the taxonomy and the corpus. To evaluate TaxoVec, we repeat the embedding selection process using three other benchmark measures of semantic similarity. We then use the vectors of the four selected embeddings as features for machine learning models that perform several NLP tasks. The performance on those tasks constitutes an extrinsic evaluation of the criterion used to select the best embedding (i.e., the adopted semantic similarity measure). Experimental results show that (i) HSS outperforms state-of-the-art measures of semantic similarity in taxonomies on a benchmark intrinsic evaluation and (ii) the embedding selected through TaxoVec clearly outperforms the embeddings selected by the competing measures on benchmark NLP tasks. We implemented the HSS, together with other benchmark measures of semantic similarity, as a full-fledged Python package called TaxoSS, whose documentation is available at https://pypi.org/project/TaxoSS.
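The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use the TaxoSS package: the function names, the toy vectors, and the reference scores (standing in for HSS values) are all hypothetical, and Pearson correlation is an assumption, since the abstract does not say which correlation coefficient is used.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def select_embedding(models, reference_scores):
    """Pick the model whose cosine similarities correlate best with the
    reference (taxonomy-based) scores, using only word pairs that the
    model's vocabulary covers."""
    best_name, best_corr = None, float("-inf")
    for name, vectors in models.items():
        pairs = [(a, b) for (a, b) in reference_scores
                 if a in vectors and b in vectors]
        if len(pairs) < 2:
            continue  # too few shared pairs to compute a correlation
        ref = [reference_scores[p] for p in pairs]
        cos = [cosine(vectors[a], vectors[b]) for a, b in pairs]
        corr = pearson(ref, cos)
        if corr > best_corr:
            best_name, best_corr = name, corr
    return best_name, best_corr

# Hypothetical taxonomy-based similarity scores (stand-ins for HSS values).
reference_scores = {("cat", "dog"): 0.8, ("cat", "car"): 0.1, ("dog", "car"): 0.2}

# Two hypothetical embedding models, each mapping words to toy 2-d vectors.
models = {
    "model_a": {"cat": [1.0, 0.0], "dog": [0.9, 0.3], "car": [0.0, 1.0]},
    "model_b": {"cat": [1.0, 0.0], "dog": [0.0, 1.0], "car": [1.0, 0.1]},
}

best_name, best_corr = select_embedding(models, reference_scores)
print(best_name, round(best_corr, 3))
```

In this toy setup, model_a places "cat" and "dog" close together and both far from "car", matching the reference scores, so it is the one selected. A real run would replace the toy dictionaries with trained embedding models and taxonomy-derived scores.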
Pages: 749-763
Page count: 14
Related papers
50 in total
  • [1] Embeddings Evaluation Using a Novel Measure of Semantic Similarity
    Giabelli, Anna
    Malandri, Lorenzo
    Mercorio, Fabio
    Mezzanzanica, Mario
    Nobani, Navid
    [J]. COGNITIVE COMPUTATION, 2022, 14 (02) : 749 - 763
  • [2] Novel metrics for computing semantic similarity with sense embeddings
    Colla, Davide
    Mensa, Enrico
    Radicioni, Daniele P.
    [J]. KNOWLEDGE-BASED SYSTEMS, 2020, 206
  • [3] A Novel Semantic Similarity Measure within Sentences
    Li, Yanni
    Li, Haisheng
    Cai, Qiang
    Han, Dongmei
    [J]. PROCEEDINGS OF 2012 2ND INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND NETWORK TECHNOLOGY (ICCSNT 2012), 2012, : 1176 - 1179
  • [4] Semantic similarity is not enough: A novel NLP-based semantic similarity measure in context
    Abbasi, Omid Reza
    Alesheikh, Ali Asghar
    Lotfata, Aynaz
    [J]. ISCIENCE, 2024, 27 (06)
  • [5] A novel method to measure the semantic similarity of HPO terms
    Peng, Jiajie
    Xue, Hansheng
    Shao, Yukai
    Shang, Xuequn
    Wang, Yadong
    Chen, Jin
    [J]. INTERNATIONAL JOURNAL OF DATA MINING AND BIOINFORMATICS, 2017, 17 (02) : 173 - 188
  • [6] Sequential Sentence Embeddings for Semantic Similarity
    Carta, Antonio
    Bacciu, Davide
    [J]. 2019 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (IEEE SSCI 2019), 2019, : 1354 - 1361
  • [7] Semantic Similarity between Turkish and European Languages Using Word Embeddings
    Senel, Lutfi Kerem
    Yucesoy, Veysel
    Koc, Aykut
    Cukur, Tolga
    [J]. 2017 25TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2017,
  • [8] Using Standardized Lexical Semantic Knowledge to Measure Similarity
    Wali, Wafa
    Gargouri, Bilel
    Ben Hamadou, Abdelmajid
    [J]. KNOWLEDGE SCIENCE, ENGINEERING AND MANAGEMENT, KSEM 2014, 2014, 8793 : 93 - 104
  • [9] Web Search Personalization Using Semantic Similarity Measure
    Sharma, Sunny
    Rana, Vijay
    [J]. PROCEEDINGS OF RECENT INNOVATIONS IN COMPUTING, ICRIC 2019, 2020, 597 : 273 - 288
  • [10] Novel Approach to Find Semantic Similarity Measure between Words
    Sahni, Lakshay
    Sehgal, Anubhav
    Kochar, Shaivi
    Ahmad, Faiyaz
    Ahmad, Tanvir
    [J]. PROCEEDINGS OF 2014 2ND INTERNATIONAL SYMPOSIUM ON COMPUTATIONAL AND BUSINESS INTELLIGENCE (ISCBI), 2014, : 89 - 92