Ontology Alignment Based on Word Embedding and Random Forest Classification

Cited by: 9
Authors
Nkisi-Orji, Ikechukwu [1]
Wiratunga, Nirmalie [1]
Massie, Stewart [1]
Hui, Kit-Ying [1]
Heaven, Rachel [2]
Affiliations
[1] Robert Gordon Univ, Aberdeen, Scotland
[2] British Geol Survey, Nottingham, England
Keywords
Ontology alignment; Word embedding; Machine classification; Semantic web; Aggregation
DOI
10.1007/978-3-030-10925-7_34
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. Most alignment techniques depend on string-based similarities, which are unable to handle the vocabulary mismatch problem. In addition, determining which similarity measures to use and how to combine them effectively in alignment systems are challenges that have persisted in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding to determine a variety of semantic similarity features between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that the classifier model uses to determine when concepts align. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can handle knowledge-light ontological resources. It also eliminates the need for learning the aggregation weights of a composition of similarity measures. Experiments using the Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems. Code related to this paper is available at: https://bitbucket.org/paravariar/rafcom.
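The feature-vector idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the toy three-dimensional embeddings, the use of Python's `difflib.SequenceMatcher` as the string-based measure, label averaging as the way to embed multi-word concepts, and all function names. It only shows how one string-based score and one embedding-based score for a pair of concept labels might be placed side by side in a single vector.

```python
from difflib import SequenceMatcher
from math import sqrt

# Toy 3-dimensional "embeddings" with made-up values (an assumption for
# illustration only); a real system would load pretrained word vectors.
TOY_EMBEDDINGS = {
    "car": [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "rock": [0.00, 0.20, 0.90],
    "stone": [0.05, 0.25, 0.85],
}


def label_vector(label):
    """Average the embeddings of the known words in a label; None if none are known."""
    vectors = [TOY_EMBEDDINGS[w] for w in label.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return None
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def string_similarity(a, b):
    """String-based similarity of two labels, in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a, b):
    """Embedding-based similarity of two labels; 0.0 when a label has no known word."""
    va, vb = label_vector(a), label_vector(b)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def feature_vector(a, b):
    """Hypothetical two-feature vector for a candidate pair of concept labels."""
    return [string_similarity(a, b), semantic_similarity(a, b)]


if __name__ == "__main__":
    for pair in [("car", "automobile"), ("rock", "stone"), ("car", "rock")]:
        print(pair, feature_vector(*pair))
```

Vectors of this kind, labelled as aligned or not aligned, could then be used to train a classifier such as scikit-learn's `RandomForestClassifier`. Because the classifier learns how to weigh the features, no hand-tuned aggregation weights are needed, which matches the motivation stated in the abstract.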
Pages: 557-572
Number of pages: 16