Ontology Alignment Based on Word Embedding and Random Forest Classification

Cited by: 9
Authors
Nkisi-Orji, Ikechukwu [1]
Wiratunga, Nirmalie [1]
Massie, Stewart [1]
Hui, Kit-Ying [1]
Heaven, Rachel [2]
Affiliations
[1] Robert Gordon Univ, Aberdeen, Scotland
[2] British Geol Survey, Nottingham, England
Keywords
Ontology alignment; Word embedding; Machine classification; Semantic web; AGGREGATION;
DOI
10.1007/978-3-030-10925-7_34
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Ontology alignment is crucial for integrating heterogeneous data sources and forms an important component of the semantic web. Accordingly, several ontology alignment techniques have been proposed and used for discovering correspondences between the concepts (or entities) of different ontologies. Most alignment techniques depend on string-based similarities, which are unable to handle the vocabulary mismatch problem. Also, determining which similarity measures to use and how to combine them effectively in alignment systems are challenges that have persisted in this area. In this work, we introduce a random forest classifier approach for ontology alignment which relies on word embedding for determining a variety of semantic similarity features between concepts. Specifically, we combine string-based and semantic similarity measures to form feature vectors that are used by the classifier model to determine when concepts align. By harnessing background knowledge and relying on minimal information from the ontologies, our approach can handle knowledge-light ontological resources. It also eliminates the need for learning the aggregation weights of a composition of similarity measures. Experiments using the Ontology Alignment Evaluation Initiative (OAEI) dataset and real-world ontologies highlight the utility of our approach and show that it can outperform state-of-the-art alignment systems. Code related to this paper is available at: https://bitbucket.org/paravariar/rafcom.
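The feature-vector idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embedding table, the choice of `difflib.SequenceMatcher` as the string measure, label averaging, and every function name below are assumptions made for illustration. It shows only how one string-based and one embedding-based similarity could be placed side by side for a concept pair; a classifier such as a random forest would then be trained on vectors like these.

```python
import math
from difflib import SequenceMatcher

# Hypothetical toy embeddings (assumption); a real system would load
# pretrained word vectors instead of this hand-written table.
TOY_EMBEDDINGS = {
    "author": [0.9, 0.1, 0.0],
    "writer": [0.8, 0.2, 0.1],
    "paper": [0.1, 0.9, 0.2],
    "article": [0.2, 0.8, 0.3],
    "river": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def label_vector(label):
    """Average the vectors of the words in a label that the table knows."""
    vectors = [TOY_EMBEDDINGS[w] for w in label.lower().split() if w in TOY_EMBEDDINGS]
    if not vectors:
        return None
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def string_similarity(a, b):
    """String-based feature: character-level matching ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def semantic_similarity(a, b):
    """Embedding-based feature: cosine of averaged word vectors, 0.0 if unknown."""
    va, vb = label_vector(a), label_vector(b)
    if va is None or vb is None:
        return 0.0
    return cosine(va, vb)


def feature_vector(a, b):
    """Feature vector for a candidate concept pair: [string, semantic]."""
    return [string_similarity(a, b), semantic_similarity(a, b)]
```

In this toy setup, `feature_vector("Author", "Writer")` gives a low string score and a high semantic score, which is the vocabulary-mismatch case the abstract says string-based measures alone cannot handle.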
Pages: 557-572
Number of pages: 16
Related papers
50 in total
  • [41] Image Classification Based on Improved Random Forest Algorithm
    Man, Weishi
    Ji, Yuanyuan
    Zhang, Zhiyu
    2018 IEEE 3RD INTERNATIONAL CONFERENCE ON CLOUD COMPUTING AND BIG DATA ANALYSIS (ICCCBDA), 2018, : 346 - 350
  • [42] RANDOM FOREST CLASSIFICATION BASED ACOUSTIC EVENT DETECTION
    Xia, Xianjun
    Togneri, Roberto
    Sohel, Ferdous
    Huang, David
    2017 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2017, : 163 - 168
  • [43] Random Forest based Traffic Classification Method In SDN
    Zhai, Yubo
    Zheng, Xianghan
    2018 INTERNATIONAL CONFERENCE ON CLOUD COMPUTING, BIG DATA AND BLOCKCHAIN (ICCBB 2018), 2018, : 66 - 70
  • [44] Automated Patent Classification Using Word Embedding
    Grawe, Mattyws F.
    Martins, Claudia A.
    Bonfante, Andreia G.
    2017 16TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA), 2017, : 408 - 411
  • [45] Citation Intent Classification Using Word Embedding
    Roman, Muhammad
    Shahid, Abdul
    Khan, Shafiullah
    Koubaa, Anis
    Yu, Lisu
    IEEE ACCESS, 2021, 9 : 9982 - 9995
  • [46] A Weighted Word Embedding Model for Text Classification
    Ren, Haopeng
    Zeng, ZeQuan
    Cai, Yi
    Du, Qing
    Li, Qing
    Xie, Haoran
    DATABASE SYSTEMS FOR ADVANCED APPLICATIONS (DASFAA 2019), PT I, 2019, 11446 : 419 - 434
  • [47] Ontology-Based Enhanced Word Embedding for Automated Information Extraction from Geoscience Reports
    Qiu, Qinjun
    Xie, Zhong
    2018 26TH INTERNATIONAL CONFERENCE ON GEOINFORMATICS (GEOINFORMATICS 2018), 2018,
  • [48] Comparing Pretrained Multilingual Word Embeddings on an Ontology Alignment Task
    Gromann, Dagmar
    Declerck, Thierry
    PROCEEDINGS OF THE ELEVENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION (LREC 2018), 2018, : 230 - 236
  • [49] Protein embedding based alignment
    Iovino, Benjamin Giovanni
    Ye, Yuzhen
    BMC BIOINFORMATICS, 2024, 25 (01)