Reconciling inconsistent data in probabilistic XML data integration

Author
Pankowski, Tadeusz [1]
Affiliation
[1] Poznan Univ Tech, Inst Control & Informat Engn, Poznan, Poland
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Dealing with inconsistent data while integrating XML data from different sources is an important task, necessary to improve the quality of data integration. Typically, data cleaning (or repairing) procedures are applied to remove inconsistencies, i.e. conflicts between data. In this paper, we present a probabilistic XML data integration setting. Each data source is assigned a probability that models its reliability level. In this way, an answer (a tuple of values of XML trees) also has a probability assigned to it. The problem is how to compute this probability, especially when the same answer is produced by many sources. We consider three semantics for computing such probabilistic answers: by-peer, by-sequence, and by-subtree semantics. The probabilistic answers can be used to resolve a class of inconsistencies that violate XML functional dependencies defined over the target schema. Given a probability distribution over a set of conflicting answers, we can choose the answer with the highest probability of being correct.
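The final step described in the abstract (choosing the most probable of several conflicting answers) can be illustrated with a minimal sketch. It does not reproduce the paper's by-peer, by-sequence, or by-subtree semantics, which are not spelled out here. As a stand-in, it assumes that sources are independent and combines the reliabilities of sources agreeing on an answer with a noisy-or rule; the function names `combine_noisy_or` and `resolve_conflict` and the sample values are hypothetical.

```python
from collections import defaultdict


def combine_noisy_or(reliabilities):
    """Probability that at least one supporting source is correct.

    ASSUMPTION: sources are independent. This is an illustrative rule,
    not one of the three semantics defined in the paper.
    """
    p_all_wrong = 1.0
    for p in reliabilities:
        p_all_wrong *= 1.0 - p
    return 1.0 - p_all_wrong


def resolve_conflict(candidates):
    """Pick the most probable answer among conflicting candidates.

    candidates: list of (answer, source_reliability) pairs that violate
    a functional dependency, i.e. several values for the same key.
    Reliabilities are assumed to lie in (0, 1].
    """
    if not candidates:
        raise ValueError("no candidate answers")

    # Group the reliabilities of the sources that produced each answer.
    support = defaultdict(list)
    for answer, reliability in candidates:
        support[answer].append(reliability)

    # Score each answer, then normalise to a distribution over the conflict set.
    scores = {a: combine_noisy_or(ps) for a, ps in support.items()}
    total = sum(scores.values())
    distribution = {a: s / total for a, s in scores.items()}

    best = max(distribution, key=distribution.get)
    return best, distribution


# Hypothetical conflict: two sources report "1999", one reports "2001".
best, dist = resolve_conflict([("1999", 0.6), ("1999", 0.5), ("2001", 0.7)])
```

In this sketch an answer backed by several moderately reliable sources can outrank an answer backed by a single more reliable source, which is the situation the abstract singles out as the hard case.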
Pages: 75-86
Page count: 12