Reconciling inconsistent data in probabilistic XML data integration

Author
Pankowski, Tadeusz [1]
Affiliation
[1] Poznan Univ Tech, Inst Control & Informat Engn, Poznan, Poland
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dealing with inconsistent data when integrating XML data from different sources is an important task, necessary for improving the quality of data integration. Typically, data cleaning (or repairing) procedures are applied to remove inconsistencies, i.e. conflicts between data. In this paper, we present a probabilistic XML data integration setting. Each data source is assigned a probability that models its reliability level. As a result, each answer (a tuple of values of XML trees) also has a probability assigned to it. The problem is how to compute this probability, especially when the same answer is produced by many sources. We consider three semantics for computing such probabilistic answers: by-peer, by-sequence, and by-subtree semantics. The probabilistic answers can be used to resolve a class of inconsistencies that violate XML functional dependencies defined over the target schema. Given a probability distribution over a set of conflicting answers, we can choose the answer with the highest probability of being correct.
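The last step of the abstract, choosing the most probable answer among conflicting ones, can be illustrated with a small sketch. The abstract does not define the by-peer, by-sequence, or by-subtree semantics, so the code below does not implement any of them. It assumes, purely for illustration, that sources are independent and that the support for an answer is the probability that at least one source producing it is correct (a noisy-or combination). The function names `combine_noisy_or` and `resolve_conflict`, the example values, and the reliabilities are all hypothetical.

```python
from collections import defaultdict
from math import prod


def combine_noisy_or(reliabilities):
    """Probability that at least one of several sources is correct.

    ASSUMPTION: sources are independent. This is an illustrative stand-in,
    not one of the three semantics named in the abstract.
    """
    return 1.0 - prod(1.0 - p for p in reliabilities)


def resolve_conflict(observations):
    """Pick the most probable answer among conflicting ones.

    observations: list of (answer, source_reliability) pairs that share the
    same left-hand side of a functional dependency and therefore should
    agree on a single answer.
    Returns (best_answer, distribution over the conflicting answers).
    """
    # Group source reliabilities by the answer each source produced.
    by_answer = defaultdict(list)
    for answer, reliability in observations:
        by_answer[answer].append(reliability)

    # Score each distinct answer, then normalize into a distribution.
    scores = {a: combine_noisy_or(ps) for a, ps in by_answer.items()}
    total = sum(scores.values())
    if total == 0:
        raise ValueError("all sources have zero reliability")
    distribution = {a: s / total for a, s in scores.items()}

    best = max(distribution, key=distribution.get)
    return best, distribution


# Hypothetical conflict: three sources disagree on one value.
best, dist = resolve_conflict([("Warsaw", 0.9), ("Poznan", 0.6), ("Poznan", 0.5)])
```

In this hypothetical run, the two sources agreeing on "Poznan" combine to a support of 0.8, which is still below the single source with reliability 0.9, so "Warsaw" is selected. A different combination rule could reverse that choice, which is why the semantics used to compute answer probabilities matters.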
Pages: 75-86
Number of pages: 12