Linking Datasets Using Semantic Textual Similarity

Cited by: 12
Authors
McCrae, John P. [1]
Buitelaar, Paul [1]
Affiliations
[1] Natl Univ Ireland Galway, Insight Ctr Data Analyt, Galway H91 A06C, Ireland
Funding
European Union Horizon 2020;
Keywords
Linked data; link discovery; ontology alignment; semantic textual similarity; structural similarity; NLP architectures;
DOI
10.2478/cait-2018-0010
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Linked data has been widely recognized as an important paradigm for representing data, and one of the most important aspects of supporting its use is the discovery of links between datasets. Many datasets contain a significant amount of textual information in the form of labels, descriptions and documentation about their elements, and precise linking rests on applying semantic textual similarity to this text. However, most linking tools so far rely only on simple string similarity metrics such as Jaccard scores. We present an evaluation of some metrics that have performed well in recent semantic textual similarity evaluations and apply these to linking existing datasets.
Pages: 109-123
Number of pages: 15
Related papers
50 in total
  • [1] Linguistic analysis of datasets for semantic textual similarity
    Wang, Chunlin
    Castellon, Irene
    Comelles, Elisabet
    DIGITAL SCHOLARSHIP IN THE HUMANITIES, 2020, 35 (02) : 471 - 484
  • [2] Phrase-based Semantic Textual Similarity for Linking Researchers
    Reyes-Ortiz, Jose A.
    Bravo, Maricela
    Padilla, Omar E.
    2015 26TH INTERNATIONAL WORKSHOP ON DATABASE AND EXPERT SYSTEMS APPLICATIONS (DEXA), 2015, : 202 - 206
  • [3] Efficient Textual Similarity using Semantic MinHashing
    Nawaz, Waqas
    Baig, Maryam
    Khan, Kifayat Ullah
    2024 IEEE INTERNATIONAL CONFERENCE ON BIG DATA AND SMART COMPUTING, IEEE BIGCOMP 2024, 2024, : 262 - 269
  • [4] Semantic Textual Similarity Using Various Approaches
    Kazula, Maciej
    Kozlowski, Marek
    MACHINE INTELLIGENCE AND BIG DATA IN INDUSTRY, 2016, 19 : 49 - 62
  • [5] Question Similarity Detection in Turkish Using Semantic Textual Similarity Methods
    Yildiz, Eray
    Findik, Yasin
    2019 27TH SIGNAL PROCESSING AND COMMUNICATIONS APPLICATIONS CONFERENCE (SIU), 2019,
  • [6] Calculation of Textual Similarity Using Semantic Relatedness Functions
    Kairaldeen, Ammar Riadh
    Ercan, Gonenc
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT II, 2015, 9042 : 516 - 524
  • [7] Multilingual Semantic Textual Similarity using Multilingual Word Representations
    Ahmed, Mahtab
    Dixit, Chahna
    Mercer, Robert E.
    Khan, Atif
    Samee, Muhammad Rifayat
    Urra, Felipe
    2020 IEEE 14TH INTERNATIONAL CONFERENCE ON SEMANTIC COMPUTING (ICSC 2020), 2020, : 194 - 198
  • [8] A Fast Similarity Search kNN for Textual Datasets
    Amorim, Leonardo Afonso
    Freitas, Mateus F.
    da Silva, Paulo Henrique
    Martins, Wellington S.
    2018 SYMPOSIUM ON HIGH PERFORMANCE COMPUTING SYSTEMS (WSCAD 2018), 2018, : 229 - 236
  • [9] Influence of Token Similarity Measures for Semantic Textual Similarity
    Sowmya, V.
    Vardhan, Vishnu B.
    Raju, Bhadri M. S. V. S.
    2016 IEEE 6TH INTERNATIONAL CONFERENCE ON ADVANCED COMPUTING (IACC), 2016, : 41 - 44
  • [10] FlexSTS: A Framework for Semantic Textual Similarity
    Freire, Janio
    Pinheiro, Vadia
    Feitosa, David
    LINGUAMATICA, 2016, 8 (02) : 23 - 31