Ontology Alignment Based on Word Embedding and Random Forest Classification

Cited by: 9
Authors
Nkisi-Orji, Ikechukwu [1 ]
Wiratunga, Nirmalie [1 ]
Massie, Stewart [1 ]
Hui, Kit-Ying [1 ]
Heaven, Rachel [2 ]
Affiliations
[1] Robert Gordon Univ, Aberdeen, Scotland
[2] British Geol Survey, Nottingham, England
Keywords
Ontology alignment; Word embedding; Machine classification; Semantic web; Aggregation
DOI
10.1007/978-3-030-10925-7_34
CLC classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. Most alignment techniques depend on string-based similarities, which are unable to handle the vocabulary mismatch problem. Moreover, determining which similarity measures to use and how to combine them effectively in alignment systems are challenges that have persisted in this area. In this work, we introduce a random forest classifier approach to ontology alignment that relies on word embedding to derive a variety of semantic similarity features between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that the classifier model uses to determine when concepts align. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can handle knowledge-light ontological resources. It also eliminates the need to learn aggregation weights for a composition of similarity measures. Experiments using the Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems. Code related to this paper is available at: https://bitbucket.org/paravariar/rafcom.
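The pipeline described in the abstract can be illustrated with a minimal sketch: pair up concept labels, build a feature vector from a string-based similarity and an embedding-based (cosine) similarity, and let a random forest decide whether the pair aligns. This is a hypothetical illustration using toy random vectors in place of pre-trained word embeddings and `SequenceMatcher` in place of the paper's specific string measures; it is not the authors' implementation (see the repository above for that).

```python
# Sketch: string + embedding similarity features for a random forest aligner.
# Assumptions: toy random vectors stand in for pre-trained word embeddings,
# and difflib's ratio stands in for the paper's string-based measures.
import numpy as np
from difflib import SequenceMatcher
from sklearn.ensemble import RandomForestClassifier

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def features(label_a, label_b, emb):
    """Feature vector for a concept pair: [string similarity, embedding similarity]."""
    string_sim = SequenceMatcher(None, label_a, label_b).ratio()
    embed_sim = cosine(emb[label_a], emb[label_b])
    return [string_sim, embed_sim]

# Toy embeddings (random, fixed seed) standing in for real word vectors.
rng = np.random.default_rng(0)
vocab = ["car", "automobile", "tree", "vehicle", "plant"]
emb = {w: rng.normal(size=50) for w in vocab}

# Toy labeled concept pairs: 1 = aligned, 0 = not aligned.
pairs = [("car", "automobile", 1), ("car", "vehicle", 1),
         ("tree", "plant", 1), ("car", "tree", 0),
         ("automobile", "plant", 0), ("vehicle", "tree", 0)]
X = [features(a, b, emb) for a, b, _ in pairs]
y = [label for _, _, label in pairs]

# The classifier replaces hand-tuned aggregation weights over similarity measures.
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([features("car", "automobile", emb)])[0])
```

With real embeddings, the second feature captures semantic relatedness (e.g. "car" vs. "automobile") that string similarity misses, which is the vocabulary-mismatch case motivating the paper.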
Pages: 557 - 572
Number of pages: 16
Related papers
(50 total)
  • [21] Improving Text Classification with Word Embedding
    Ge, Lihao
    Moh, Teng-Sheng
    2017 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2017, : 1796 - 1805
  • [22] Incorporating Word Embedding and Hybrid Model Random Forest Softmax Regression for Predicting News Categories
    Khosa, Saima
    Rustam, Furqan
    Mehmood, Arif
    Choi, Gyu Sang
    Ashraf, Imran
    MULTIMEDIA TOOLS AND APPLICATIONS, 2024, 83 (11) : 31279 - 31295
  • [24] Transportation sentiment analysis using word embedding and ontology-based topic modeling
    Ali, Farman
    Kwak, Daehan
    Khan, Pervez
    El-Sappagh, Shaker
    Ali, Amjad
    Ullah, Sana
    Kim, Kye Hyun
    Kwak, Kyung-Sup
    KNOWLEDGE-BASED SYSTEMS, 2019, 174 : 27 - 42
  • [25] Gromov-Wasserstein Alignment of Word Embedding Spaces
    Alvarez-Melis, David
    Jaakkola, Tommi S.
    2018 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2018), 2018, : 1881 - 1890
  • [26] Challenges and Solutions with Alignment and Enrichment of Word Embedding Models
    Sahin, Cem Safak
    Caceres, Rajmonda S.
    Oselio, Brandon
    Campbell, William M.
    NATURAL LANGUAGE PROCESSING AND INFORMATION SYSTEMS, NLDB 2017, 2017, 10260 : 260 - 266
  • [27] Image classification by Distortion-Free Graph Embedding and KNN-Random forest
    Temir, Askhat
    Artykbayev, Kamalkhan
    Demirci, M. Fatih
    2020 17TH CONFERENCE ON COMPUTER AND ROBOT VISION (CRV 2020), 2020, : 33 - 38
  • [28] Word Embedding-based Web Service Representations for Classification and Clustering
    Zhang, Xiangping
    Liu, Jianxun
    Shi, Min
    Cao, Buqing
    2021 IEEE INTERNATIONAL CONFERENCE ON SERVICES COMPUTING (SCC 2021), 2021, : 34 - 43
  • [29] Study on the Chinese Word Semantic Relation Classification with Word Embedding
    Shijia, E.
    Jia, Shengbin
    Xiang, Yang
    NATURAL LANGUAGE PROCESSING AND CHINESE COMPUTING, NLPCC 2017, 2018, 10619 : 849 - 855
  • [30] Random forest solar power forecast based on classification optimization
    Liu, Da
    Sun, Kun
    ENERGY, 2019, 187