Semantic-Similarity-Based Schema Matching for Management of Building Energy Data

Cited by: 3
Authors
Pan, Zhiyu [1]
Pan, Guanchen [1]
Monti, Antonello [1, 2]
Affiliations
[1] Rhein Westfal TH Aachen, Inst Automation Complex Power Syst, D-52074 Aachen, Germany
[2] Fraunhofer Inst Appl Informat Technol FIT, D-53757 St Augustin, Germany
Funding
EU Horizon 2020
Keywords
semantic similarity; schema matching; active learning
DOI
10.3390/en15238894
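Chinese Library Classification (CLC) number
TE [Petroleum and Natural Gas Industry]; TK [Energy and Power Engineering]
Discipline classification code
0807; 0820
Abstract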
The growing volume of heterogeneous data in the building energy domain makes data integration difficult. Schema matching, which maps raw data from the building energy domain to a generic data model, is a necessary step in data integration and provides a unique representation. Only a small amount of labeled data exists for schema matching, and labeling data manually is time-consuming and labor-intensive. This paper applies semantic-similarity methods from natural language processing to automate the schema-mapping process, which reduces the manual effort in heterogeneous data integration. An active-learning method is applied to address the lack of labeled data in schema matching. Results of schema matching on building-energy-domain data show that the pre-trained language model substantially improves schema-matching accuracy and that the active-learning method greatly reduces the amount of labeled data required.
Pages: 23
Related papers
50 in total
  • [1] A semantic-similarity-based method for object description and clustering
    Xu, Jing
    Okada, Shogo
    Nitta, Katsumi
    [J]. 2013 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC 2013), 2013, : 3669 - 3674
  • [2] Using Semantic Similarity for Schema Matching of Semi-structured and Linked Data
    Kettouch, Mohamed Salah
    Luca, Cristina
    Hobbs, Mike
    Dascalu, Sergiu
    [J]. PROCEEDINGS OF THE 2017 7TH INTERNATIONAL CONFERENCE INTERNET TECHNOLOGIES AND APPLICATIONS (ITA), 2017, : 128 - 133
  • [3] Using ontologies for measuring semantic similarity in data warehouse schema matching process
    Banek, M.
    Vrdoljak, B.
    Tjoa, A. M.
    [J]. CONTEL 2007: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS, 2007, : 227 - +
  • [4] A novel method for measuring semantic similarity for XML schema matching
    Jeong, Buhwan
    Lee, Damon
    Cho, Hyunbo
    Lee, Jaewook
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2008, 34 (03) : 1651 - 1658
  • [5] Applications of corpus-based semantic similarity and word segmentation to database schema matching
    Islam, Aminul
    Inkpen, Diana
    Kiringa, Iluju
    [J]. VLDB JOURNAL, 2008, 17 (05) : 1293 - 1320
  • [6] Applications of corpus-based semantic similarity and word segmentation to database schema matching
    Aminul Islam
    Diana Inkpen
    Iluju Kiringa
    [J]. The VLDB Journal, 2008, 17 : 1293 - 1320
  • [7] Assessing the Performance of a New Semantic Similarity Measure Designed for Schema Matching for Mediation Systems
    Yousfi, Aola
    Hafid Elyazidi, Moulay
    Zellou, Ahmed
    [J]. COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2018, PT I, 2018, 11055 : 64 - 74
  • [8] Matching Web Services Based on Semantic Similarity
    Ji, Xiang
    Li, Yongqing
    Liu, Lei
    Zhang, Rui
    [J]. ADVANCES IN COMPUTER SCIENCE, ENVIRONMENT, ECOINFORMATICS, AND EDUCATION, PT II, 2011, 215 : 598 - 604
  • [9] Aggregation of Similarity Measures in Schema Matching based on Generalized Mean
    Elshwimy, Faten A.
    Algergawy, Alsayed
    Sarhan, Amany
    Sallam, Elsayed A.
    [J]. 2014 IEEE 30TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING WORKSHOPS (ICDEW), 2014, : 74 - 79
  • [10] Semantic Matching Similarity Algorithm Based on Dependency Trees
    He, Kun
    Li, Wei
    Huang, Bo
    [J]. INTERNATIONAL CONFERENCE ON MECHANICAL, ELECTRONIC AND INFORMATION TECHNOLOGY (ICMEIT 2018), 2018, : 540 - 544