Reconciling schemas of disparate data sources: A machine-learning approach

Cited by: 0
Authors
Doan, AH [1 ]
Domingos, P [1 ]
Halevy, A [1 ]
Affiliations
[1] Univ Washington, Dept Comp Sci & Engn, Seattle, WA 98195 USA
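Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract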
A data-integration system provides access to a multitude of data sources through a single mediated schema. A key bottleneck in building such systems has been the laborious manual construction of semantic mappings between the source schemas and the mediated schema. We describe LSD, a system that employs and extends current machine-learning techniques to semi-automatically find such mappings. LSD first asks the user to provide the semantic mappings for a small set of data sources, then uses these mappings together with the sources to train a set of learners. Each learner exploits a different type of information either in the source schemas or in their data. Once the learners have been trained, LSD finds semantic mappings for a new data source by applying the learners, then combining their predictions using a meta-learner. To further improve matching accuracy, we extend machine-learning techniques so that LSD can incorporate domain constraints as an additional source of knowledge, and develop a novel learner that utilizes the structural information in XML documents. Our approach is thus distinguished in that it incorporates multiple types of knowledge. Importantly, its architecture is extensible to additional learners that may exploit new kinds of information. We describe a set of experiments on several real-world domains, and show that LSD proposes semantic mappings with a high degree of accuracy.
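The train-then-combine flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from LSD itself: the `TokenLearner` class, the toy real-estate elements, the mediated-schema labels (`PHONE`, `PRICE`, `DESCRIPTION`), and above all the plain weighted average that stands in for the meta-learner. Domain constraints and the XML structure learner are left out.

```python
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lowercase alphanumeric tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TokenLearner:
    """Toy base learner: scores each label by token overlap with training data.

    `view` selects which kind of information the learner looks at
    (for example the element name, or the data values).
    """

    def __init__(self, view):
        self.view = view
        self.counts = defaultdict(Counter)

    def fit(self, examples):
        for element, label in examples:
            self.counts[label].update(tokens(self.view(element)))

    def predict(self, element):
        toks = tokens(self.view(element))
        # Add-one smoothing so that every known label keeps a nonzero score.
        raw = {label: 1 + sum(c[t] for t in toks) for label, c in self.counts.items()}
        total = sum(raw.values())
        return {label: score / total for label, score in raw.items()}


def combine(predictions, weights):
    """Stand-in for a meta-learner: weighted average of the learners' scores."""
    total = sum(weights)
    return {
        label: sum(w * p[label] for w, p in zip(weights, predictions)) / total
        for label in predictions[0]
    }


def match(element, learners, weights):
    """Return the mediated-schema label with the highest combined score."""
    scores = combine([learner.predict(element) for learner in learners], weights)
    return max(scores, key=scores.get)


# Hypothetical user-supplied mappings for one training source.
training = [
    ({"name": "contact-phone", "values": ["(206) 555 0100", "(206) 555 0199"]}, "PHONE"),
    ({"name": "list-price", "values": ["$250,000", "$310,000"]}, "PRICE"),
    ({"name": "comments", "values": ["great view", "close to downtown"]}, "DESCRIPTION"),
]

# One learner per type of information: schema names and data values.
learners = [
    TokenLearner(lambda e: e["name"]),
    TokenLearner(lambda e: " ".join(e["values"])),
]
for learner in learners:
    learner.fit(training)

weights = [1.0, 1.0]  # hypothetical; a real meta-learner would learn these

# Elements of a new, unmapped source.
print(match({"name": "agent-phone", "values": ["(425) 555 0142"]}, learners, weights))
# prints PHONE
print(match({"name": "extra-info", "values": ["great location, close to schools"]}, learners, weights))
# prints DESCRIPTION
```

In the second call the name-based learner has no evidence (its scores are uniform), so the value-based learner decides the match. This is the reason for combining learners that each see a different kind of information.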
Pages: 509-520
Page count: 12
Related papers
50 in total
  • [41] Road-Deterioration Detection using Road Vibration Data with Machine-Learning Approach
    Takanashi, Masaki
    Ishii, Yoshinao
    Sato, Shu-ichi
    Sano, Noriyoshi
    Sanda, Katsushi
    2020 IEEE INTERNATIONAL CONFERENCE ON PROGNOSTICS AND HEALTH MANAGEMENT (ICPHM), 2020
  • [42] Compressive-sensing data reconstruction for structural health monitoring: a machine-learning approach
    Bao, Yuequan
    Tang, Zhiyi
    Li, Hui
    STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL, 2020, 19 (01) : 293 - 304
  • [43] A machine-learning approach to thunderstorm forecasting through post-processing of simulation data
    Vahid Yousefnia, Kianusch
    Boelle, Tobias
    Zoebisch, Isabella
    Gerz, Thomas
    QUARTERLY JOURNAL OF THE ROYAL METEOROLOGICAL SOCIETY, 2024 : 3495 - 3510
  • [44] Machine-learning approach for quantified resolvability enhancement of low-dose STEM data
    Gambini, Laura
    Mullarkey, Tiarnan
    Jones, Lewys
    Sanvito, Stefano
    MACHINE LEARNING-SCIENCE AND TECHNOLOGY, 2023, 4 (01)
  • [45] Complementing subjective with objective data in analysing expertise: A machine-learning approach applied to badminton
    Dieu, Olivier
    Schnitzler, Christophe
    Llena, Clement
    Potdevin, Francois
    JOURNAL OF SPORTS SCIENCES, 2020, 38 (17) : 1943 - 1952
  • [46] Integrating data from disparate sources: A mass collaboration approach
    McCann, R
    Kramnik, A
    Shen, W
    Varadarajan, V
    Sobulo, O
    Doan, A
    ICDE 2005: 21ST INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2005 : 487 - 488
  • [47] An Automated Machine-Learning Approach for Road Pothole Detection Using Smartphone Sensor Data
    Wu, Chao
    Wang, Zhen
    Hu, Simon
    Lepine, Julien
    Na, Xiaoxiang
    Ainalis, Daniel
    Stettler, Marc
    SENSORS, 2020, 20 (19) : 1 - 23
  • [48] Differential diagnosis of Parkinsonian Syndromes - combining clinical and imaging data in a machine-learning approach
    Meindl, T.
    Li, Y.
    Jochim, A.
    Mantel, T.
    Hapfelmeier, A.
    Haslinger, B.
    MOVEMENT DISORDERS, 2019, 34 : S813 - S813
  • [49] A machine-learning approach to map landscape connectivity in Aedes aegypti with genetic and environmental data
    Pless, Evlyn
    Saarman, Norah P.
    Powell, Jeffrey R.
    Caccone, Adalgisa
    Amatulli, Giuseppe
    PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2021, 118 (09)
  • [50] Recognition of Car Front Facing Style for Machine-Learning Data Annotation: A Quantitative Approach
    Ma, Lisha
    Wu, Yu
    Li, Qingnan
    Yuan, Xiaofang
    SYMMETRY-BASEL, 2022, 14 (06)