Head to Head: Semantic Similarity of Multi-Word Terms

Cited by: 3
Authors
Spasic, Irena [1]
Corcoran, Padraig [1]
Gagarin, Andrei [2]
Buerki, Andreas [3]
Affiliations
[1] Cardiff Univ, Sch Comp Sci & Informat, Cardiff CF10 4PE, S Glam, Wales
[2] Cardiff Univ, Sch Math, Cardiff CF24 4AG, S Glam, Wales
[3] Cardiff Univ, Sch English Commun & Philosophy, Cardiff CF10 3EU, S Glam, Wales
Source
IEEE ACCESS | 2018 / Vol. 6
Keywords
Semantic similarity; natural language processing; clustering methods; knowledge acquisition; TERMINOLOGY; RECOGNITION; EVOLVIEW; CORPUS;
DOI
10.1109/ACCESS.2018.2826224
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Terms are linguistic signifiers of domain-specific concepts. Semantic similarity between terms refers to the corresponding distance in the conceptual space. In this paper, we use lexico-syntactic information to define a vector space representation in which cosine similarity closely approximates semantic similarity between the corresponding terms. Given a multi-word term, each word is weighted in terms of its defining properties. In this context, the head noun is given the highest weight. Other words are weighted depending on their relations to the head noun. We formalized the problem as that of determining a topological ordering of a directed acyclic graph, which is based on constituency and dependency relations within a noun phrase. To counteract the errors associated with automatically inferred constituency and dependency relations, we implemented a heuristic approach to approximating the topological ordering. Different weights are assigned to different words based on their positions. Clustering experiments performed on such a vector space representation showed considerable improvement over the conventional bag-of-words representation. Specifically, it more consistently reflected semantic similarity between the terms. This was established by analyzing the differences between automatically generated dendrograms and manually constructed taxonomies. In conclusion, our method can be used to semi-automate taxonomy construction.
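The idea of position-based weighting can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the head noun is simply the last word of the term and that weights shrink geometrically with distance from the head. The function names and the `decay` parameter are hypothetical and chosen only for illustration. Setting `decay=1.0` reduces the representation to a plain bag of words.

```python
import math
from collections import defaultdict


def weighted_vector(term, decay=0.5):
    """Map a multi-word term to a sparse {word: weight} vector.

    Assumption (not from the paper): the head noun is the last word,
    and each step to the left multiplies the weight by `decay`.
    With decay=1.0 every word weighs 1, i.e. a bag of words.
    """
    words = term.lower().split()
    vector = defaultdict(float)
    for distance, word in enumerate(reversed(words)):
        vector[word] += decay ** distance
    return dict(vector)


def cosine(u, v):
    """Cosine similarity between two sparse {word: weight} vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(weight * weight for weight in u.values()))
    norm_v = math.sqrt(sum(weight * weight for weight in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity(term_a, term_b, decay=0.5):
    """Cosine similarity of two terms under the position-based weights."""
    return cosine(weighted_vector(term_a, decay), weighted_vector(term_b, decay))


if __name__ == "__main__":
    pairs = [("knee injury", "shoulder injury"),
             ("knee injury", "knee replacement")]
    for a, b in pairs:
        head_weighted = similarity(a, b, decay=0.5)
        bag_of_words = similarity(a, b, decay=1.0)
        print(f"{a!r} vs {b!r}: head-weighted={head_weighted:.2f}, "
              f"bag-of-words={bag_of_words:.2f}")
```

Under these assumptions, two terms that share a head noun ("knee injury", "shoulder injury") score higher than two terms that share only a modifier ("knee injury", "knee replacement"), whereas a bag of words cannot tell the two pairs apart.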
Pages: 20545-20557
Page count: 13