Semantic classification of biomedical concepts using distributional similarity

Cited by: 22
Authors
Fan, Jung-Wei [1]
Friedman, Carol [1]
Affiliations
[1] Columbia Univ, Dept Biomed Informat, New York, NY USA
DOI
10.1197/jamia.M2314
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Objective: To develop an automated, high-throughput, and reproducible method for reclassifying and validating ontological concepts for natural language processing applications.
Design: We developed a distributional similarity approach to classify Unified Medical Language System (UMLS) concepts. Classification models were built for seven broad, biomedically relevant semantic classes created by grouping subsets of the UMLS semantic types. We used contextual features based on syntactic properties obtained from two different large corpora, with alpha-skew divergence as the similarity measure.
Measurements: The test sets were generated automatically from the changes that the National Library of Medicine made to the semantic classification of concepts between the UMLS 2005AA and 2006AA releases. Error rates were calculated and a misclassification analysis was performed.
Results: The lowest estimated error rates were 0.198 when the correct classification had to be our top prediction and 0.116 when it had to be among our top two predictions.
Conclusion: The results demonstrate that the distributional similarity approach can recommend high-level semantic classifications suitable for use in natural language processing.
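As a rough illustration of the similarity measure named in the abstract, the sketch below computes an alpha-skew divergence in the commonly used form KL(r || alpha*q + (1 - alpha)*r) and ranks candidate classes by it. This is not the authors' implementation: the value alpha = 0.99, the direction of the divergence, the feature names, the class names, and the helper `rank_classes` are all assumptions chosen for the example.

```python
import math

def skew_divergence(q, r, alpha=0.99):
    """Alpha-skew divergence KL(r || alpha*q + (1 - alpha)*r).

    q and r map context features to probabilities. Mixing a little of r
    into q keeps the denominator positive wherever r has mass.
    alpha=0.99 is an assumed setting, not a value taken from the abstract.
    """
    total = 0.0
    for feature, r_prob in r.items():
        if r_prob == 0.0:
            continue  # zero-probability features contribute nothing
        mixed = alpha * q.get(feature, 0.0) + (1.0 - alpha) * r_prob
        total += r_prob * math.log(r_prob / mixed)
    return total

def rank_classes(concept_profile, class_profiles, alpha=0.99):
    """Return class names ordered from most to least similar (hypothetical helper)."""
    scored = [
        (skew_divergence(profile, concept_profile, alpha), name)
        for name, profile in class_profiles.items()
    ]
    return [name for _, name in sorted(scored)]

# Toy, made-up context distributions (syntactic-dependency style features).
concept = {"treat_obj": 0.7, "dose_mod": 0.3}
classes = {
    "drug": {"treat_obj": 0.6, "dose_mod": 0.4},
    "disorder": {"diagnose_obj": 0.9, "treat_obj": 0.1},
}
ranking = rank_classes(concept, classes)
```

Under these assumptions, the first entry of `ranking` plays the role of the top prediction and the first two entries play the role of the top two predictions used in the reported error rates.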
Pages: 467-477
Page count: 11