The Nu Expression for Probabilistic Data Integration

Authors
Evgenia I. Polyakova
Andre G. Journel
Affiliation
[1] Stanford University, Department of Geological and Environmental Sciences
Source
Mathematical Geology | 2007 / Vol. 39, No. 8
Keywords
Data integration; Data interaction vs. dependence; Updating probabilities; Conditional independence
DOI
Not available
Abstract
The general problem of data integration is expressed as that of combining probability distributions conditioned to each individual datum or data event into a posterior probability for the unknown conditioned jointly to all data. Any such combination of information requires taking into account data interaction for the specific event being assessed. The nu expression provides an exact analytical representation of such a combination. This representation allows a clear and useful separation of the two components of any data integration algorithm: individual data information content and data interaction, the latter being different from data dependence. Any estimation workflow that fails to address data interaction is not only suboptimal, but may result in severe bias. The nu expression reduces the possibly very complex joint data interaction to a single multiplicative correction parameter ν0, which is difficult to evaluate but whose exact analytical expression is given; the availability of such an expression provides avenues for its determination or approximation. The case ν0=1 is more comprehensive than data conditional independence; it delivers a preliminary robust approximation in the presence of actual data interaction. An experiment where the exact results are known allows the results of the ν0=1 approximation to be checked against the traditional estimators based on the assumption of data independence.
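The abstract does not write the expression out. As a rough illustration only, the sketch below assumes the form in which the nu expression is usually stated: with the "distance" x = (1 - P) / P defined for the prior (x0), for each single-datum probability (xi), and for the joint posterior (x), the combination is x / x0 = ν0 · Π (xi / x0). The function names `distance` and `nu_combine` are hypothetical and do not come from the paper.

```python
from math import prod


def distance(p):
    """Return x = (1 - p) / p for a probability p with 0 < p <= 1."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return (1.0 - p) / p


def nu_combine(prior, single_datum_probs, nu0=1.0):
    """Combine P(A | D_i) values into an estimate of P(A | D_1, ..., D_n).

    Assumed form: x / x0 = nu0 * prod(x_i / x0), with x = (1 - P) / P.
    nu0 = 1 is the approximation discussed in the abstract; any other
    value must be supplied by the caller.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must lie strictly between 0 and 1")
    x0 = distance(prior)
    ratios = [distance(p) / x0 for p in single_datum_probs]
    x = nu0 * x0 * prod(ratios)
    # Invert x = (1 - P) / P to recover the posterior probability.
    return 1.0 / (1.0 + x)


if __name__ == "__main__":
    # Two data, each raising a prior of 0.5 to 0.8, combined with nu0 = 1.
    print(nu_combine(0.5, [0.8, 0.8]))
```

With a single datum and ν0 = 1 the function returns that datum's probability unchanged, which is a quick sanity check on the assumed form. Choosing ν0 different from 1 is where data interaction would enter; the sketch does not attempt to estimate it.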
Pages: 715-733
Number of pages: 18