Measures of semantic similarity and relatedness in the biomedical domain

Cited by: 316
Authors
Pedersen, Ted
Pakhomov, Serguei V. S.
Patwardhan, Siddharth
Chute, Christopher G.
Affiliations
[1] Univ Minnesota, Dept Comp Sci, Duluth, MN 55812 USA
[2] Mayo Coll Med, Div Biomed Informat, Rochester, MN USA
[3] Univ Utah, Sch Comp, Salt Lake City, UT 84112 USA
Keywords
semantic similarity; path based measures; information content; context vectors; SNOMED-CT;
DOI
10.1016/j.jbi.2006.06.004
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification code
081203 ; 0835 ;
Abstract
Measures of semantic similarity between concepts are widely used in Natural Language Processing. In this article, we show how six existing domain-independent measures can be adapted to the biomedical domain. These measures were originally based on WordNet, an English lexical database of concepts and relations. In this research, we adapt these measures to the SNOMED-CT® ontology of medical concepts. The measures include two path-based measures and three measures that augment path-based measures with information content statistics from corpora. We also derive a context vector measure based on medical corpora that can be used as a measure of semantic relatedness. These six measures are evaluated against a newly created test bed of 30 medical concept pairs scored by three physicians and nine medical coders. We find that the medical coders and physicians differ in their ratings, and that the context vector measure correlates most closely with the physicians, while the path-based measures and one of the information content measures correlate most closely with the medical coders. We conclude that there is a role both for more flexible measures of relatedness based on information derived from corpora and for measures that rely on existing ontological structures. © 2006 Elsevier Inc. All rights reserved.
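As a rough illustration of the two ontology-based families the abstract mentions, the sketch below computes one path-based score and one information-content score on a toy is-a hierarchy. Everything in it is an assumption made for illustration: the concept names, the corpus counts, the single-parent tree (SNOMED-CT allows multiple parents), the inverse-node-count path formula, and the Resnik-style information content score. The abstract does not specify which formulas the article uses, so this should not be read as the authors' implementation.

```python
import math

# Hypothetical toy is-a hierarchy (child -> parent); not taken from SNOMED-CT.
PARENT = {
    "disease": None,
    "heart disease": "disease",
    "lung disease": "disease",
    "myocardial infarction": "heart disease",
    "angina": "heart disease",
    "pneumonia": "lung disease",
}

# Made-up corpus counts of direct mentions of each concept.
COUNTS = {
    "disease": 10,
    "heart disease": 20,
    "lung disease": 10,
    "myocardial infarction": 30,
    "angina": 10,
    "pneumonia": 20,
}


def ancestors(concept):
    """Return the concept followed by its ancestors, ending at the root."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def lowest_common_subsumer(a, b):
    """Return the most specific concept that subsumes both a and b."""
    ancestors_of_b = set(ancestors(b))
    for candidate in ancestors(a):
        if candidate in ancestors_of_b:
            return candidate
    raise ValueError("concepts share no ancestor")


def path_similarity(a, b):
    """Inverse of the number of nodes on the shortest is-a path from a to b."""
    lcs = lowest_common_subsumer(a, b)
    nodes = ancestors(a).index(lcs) + ancestors(b).index(lcs) + 1
    return 1.0 / nodes


def information_content(concept):
    """-log of the probability mass of the concept and all its descendants."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subsumed / total)


def resnik_similarity(a, b):
    """Resnik-style score: information content of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(a, b))


if __name__ == "__main__":
    for pair in [("myocardial infarction", "angina"),
                 ("myocardial infarction", "pneumonia")]:
        print(pair, path_similarity(*pair), resnik_similarity(*pair))
```

In this toy setup, both scores rank the two heart conditions as more similar to each other than a heart condition is to pneumonia. The path score depends only on the hierarchy, while the information-content score also depends on the corpus counts.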
Pages: 288-299
Page count: 12
Related papers
50 in total
  • [21] Exploiting Taxonomical Knowledge to Compute Semantic Similarity: An Evaluation in the Biomedical Domain
    Batet, Montserrat
    Sanchez, David
    Valls, Aida
    Gibert, Karina
    TRENDS IN APPLIED INTELLIGENT SYSTEMS, PT I, PROCEEDINGS, 2010, 6096 : 274 - +
  • [22] Neural sentence embedding models for semantic similarity estimation in the biomedical domain
    Kathrin Blagec
    Hong Xu
    Asan Agibetov
    Matthias Samwald
    BMC Bioinformatics, 20
  • [23] New ontology-based semantic similarity measure for the biomedical domain
    Nguyen, Hoa A.
    Al-Mubaid, Hisham
    2006 IEEE INTERNATIONAL CONFERENCE ON GRANULAR COMPUTING, 2006, : 623 - +
  • [24] Semantic Similarity in Biomedical Ontologies
    Pesquita, Catia
    Faria, Daniel
    Falcao, Andre O.
    Lord, Phillip
    Couto, Francisco M.
    PLOS COMPUTATIONAL BIOLOGY, 2009, 5 (07)
  • [25] Supervised Biomedical Semantic Similarity
    Sousa, Rita. T. T.
    Silva, Sara
    Pesquita, Catia
    IEEE ACCESS, 2023, 11 : 60635 - 60645
  • [26] Explicit Semantic Analysis for Computing Semantic Relatedness of Biomedical Text
    Jaiswal, Ayush
    Bhargava, Anunay
    2014 5TH INTERNATIONAL CONFERENCE CONFLUENCE THE NEXT GENERATION INFORMATION TECHNOLOGY SUMMIT (CONFLUENCE), 2014, : 929 - 934
  • [27] African Wordnet as a tool to identify semantic relatedness and semantic similarity
    Madonsela, Stanley
    SOUTH AFRICAN JOURNAL OF AFRICAN LANGUAGES, 2019, 39 (02) : 185 - 190
  • [28] Enabling semantic similarity estimation across multiple ontologies: An evaluation in the biomedical domain
    Sanchez, David
    Sole-Ribalta, Albert
    Batet, Montserrat
    Serratosa, Francesc
    JOURNAL OF BIOMEDICAL INFORMATICS, 2012, 45 (01) : 141 - 155
  • [29] Domain-Specific Semantic Relatedness from Wikipedia Structure: A Case Study in Biomedical Text
    Sajadi, Armin
    Milios, Evangelos E.
    Keselj, Vlado
    Janssen, Jeannette C. M.
    COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING (CICLING 2015), PT I, 2015, 9041 : 347 - 360
  • [30] Semantic relatedness and similarity of biomedical terms: examining the effects of recency, size, and section of biomedical publications on the performance of word2vec
    Zhu, Yongjun
    Yan, Erjia
    Wang, Fei
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 2017, 17