Semantic similarity using first and second order co-occurrence matrices and information content vectors

Authors
Pesaranghader, Ahmad [1]
Muthaiyah, Saravanan [2]
Affiliations
[1] Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
[2] Faculty of Management, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
Source
WSEAS Transactions on Computers | 2013 / Vol. 12 / No. 03
Keywords
Natural language processing systems - Medical information systems - Ontology
DOI
Not available
Abstract
The sheer volume of data on the Web demands automated knowledge engineering techniques that let machines build an integrated definition of all available data and so arrive at a single understanding of otherwise discrete data sources. This research addresses that problem through measures of semantic similarity, which are widely used in ontology alignment, information retrieval, and natural language processing. The study also introduces new normalized functions based on first- and second-order context and on information content vectors of concepts in a corpus. The measures are applied to the Unified Medical Language System (UMLS), using WordNet as a general taxonomy and MEDLINE abstracts as the corpus from which information content and information content vectors are extracted, and are evaluated against a purpose-built test bed of 301 biomedical concept pairs scored by medical residents. The paper shows that the newly proposed semantic similarity measures outperform previous functions.
Pages: 95 - 104
Related papers
  • [1] Semantic Relation Discovery by Using Co-occurrence Information
    Schulz, Stefan
    Costa, Catalina Martinez
    Kreuzthaler, Markus
    Minarro-Gimenez, Jose Antonio
    Andersen, Ulrich
    Jensen, Anders Boeck
    Maegaard, Bente
    LREC 2014 - NINTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2014
  • [2] Comparison of Semantic Similarity for Different Languages Using the Google n-gram Corpus and Second-Order Co-occurrence Measures
    Joubarne, Colette
    Inkpen, Diana
    ADVANCES IN ARTIFICIAL INTELLIGENCE, 2011, 6657 : 216 - 221
  • [3] Semantic Hashtag Relation Classification Using Co-occurrence Word Information
    Sungwon Seo
    Jong-Kook Kim
    Sung-Il Kim
    Jeewoo Kim
    Joongheon Kim
    Wireless Personal Communications, 2019, 107 : 1355 - 1365
  • [4] Semantic Hashtag Relation Classification Using Co-occurrence Word Information
    Seo, Sungwon
    Kim, Jong-Kook
    Kim, Sung-Il
    Kim, Jeewoo
    Kim, Joongheon
    WIRELESS PERSONAL COMMUNICATIONS, 2019, 107 (03) : 1355 - 1365
  • [5] Semantic Hashtag Relation Classification Using Co-occurrence Word Information
    Seo, Sungwon
    Kim, Jong-Kook
    Choi, Lynn
    2017 NINTH INTERNATIONAL CONFERENCE ON UBIQUITOUS AND FUTURE NETWORKS (ICUFN 2017), 2017 : 860 - 862
  • [6] Improving the Keyword Co-occurrence Analysis: An Integrated Semantic Similarity Approach
    Bhuyan, A.
    Sanguri, K.
    Sharma, H.
    2021 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL ENGINEERING AND ENGINEERING MANAGEMENT (IEEE IEEM21), 2021 : 482 - 487
  • [7] Computing Text Semantic Similarity with Syntactic Network of Co-occurrence Distance
    Jiao Y.
    Jing M.
    Kang F.
    Data Analysis and Knowledge Discovery, 2019, 3 (12) : 93 - 100
  • [8] Co-occurrence and Semantic Similarity Based Hybrid Approach for Improving Automatic Query Expansion in Information Retrieval
    Singh, Jagendra
    Sharan, Aditi
    DISTRIBUTED COMPUTING AND INTERNET TECHNOLOGY, ICDCIT 2015, 2015, 8956 : 415 - 418
  • [9] Improving Gloss Vector Semantic Relatedness Measure by Integrating Pointwise Mutual Information Optimizing Second-order Co-occurrence Vectors computed from Biomedical Corpus and UMLS
    Pesaranghader, Ahmad
    Muthaiyah, Saravanan
    Pesaranghader, Ali
    2013 INTERNATIONAL CONFERENCE ON INFORMATICS AND CREATIVE MULTIMEDIA (ICICM), 2013 : 196 - 201
  • [10] Semantic information retrieval research based on co-occurrence analysis
    Lou, Wen
    Qiu, Junping
    ONLINE INFORMATION REVIEW, 2014, 38 (01) : 4 - 23