Representing uncertain data: models, properties, and algorithms

Authors
Anish Das Sarma
Omar Benjelloun
Alon Halevy
Shubha Nabar
Jennifer Widom
Affiliations
[1] Stanford University
[2] Google Inc.
[3] Microsoft Corp
Source
The VLDB Journal | 2009 / Vol. 18
Keywords
Uncertain data; Data modeling; Uncertainty
DOI
Not available
Abstract
In general terms, an uncertain relation encodes a set of possible certain relations. There are many ways to represent uncertainty, ranging from alternative values for attributes to rich constraint languages. Among the possible models for uncertain data, there is a tension between simple and intuitive models, which tend to be incomplete, and complete models, which tend to be nonintuitive and more complex than necessary for many applications. We present a space of models for representing uncertain data based on a variety of uncertainty constructs and tuple-existence constraints. We explore a number of properties and results for these models. We study completeness of the models, as well as closure under relational operations, and we give results relating closure and completeness. We then examine whether different models guarantee unique representations of uncertain data, and for those models that do not, we provide complexity results and algorithms for testing equivalence of representations. The next problem we consider is that of minimizing the size of representation of models, showing that minimizing the number of tuples also minimizes the size of constraints. We show that minimization is intractable in general and study the more restricted problem of maintaining minimality incrementally when performing operations. Finally, we present several results on the problem of approximating uncertain data in an insufficiently expressive model.
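The opening claim, that an uncertain relation encodes a set of possible certain relations, can be illustrated with a minimal Python sketch. The encoding used here (a list of alternatives plus an "optional" flag per tuple) and all names in it are hypothetical choices for illustration, not the paper's notation or algorithms.

```python
from itertools import product


def possible_worlds(uncertain_relation):
    """Enumerate the certain relations encoded by a simple uncertain relation.

    Assumed (hypothetical) encoding: each uncertain tuple is a pair
    (alternatives, optional), where `alternatives` is a list of candidate
    certain tuples and `optional` means the tuple may be absent entirely.
    """
    choices = []
    for alternatives, optional in uncertain_relation:
        options = list(alternatives)
        if optional:
            options.append(None)  # None stands for "tuple absent in this world"
        choices.append(options)

    worlds = set()
    # Pick one option per uncertain tuple, independently of the others.
    for combo in product(*choices):
        worlds.add(frozenset(t for t in combo if t is not None))
    return worlds


# Example data (made up): one tuple with two alternatives, one optional tuple.
relation = [
    ([("Amy", "crow"), ("Amy", "raven")], False),
    ([("Bob", "finch")], True),
]
worlds = possible_worlds(relation)
for world in sorted(worlds, key=lambda w: sorted(w)):
    print(sorted(world))
```

Because each tuple's choice is made independently in this sketch, a world set such as "both tuples present or neither" has no encoding in it, which is one concrete instance of the incompleteness that the abstract attributes to simple models.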
Pages: 989 - 1019
Number of pages: 30
Related papers
50 in total
  • [31] Fast Algorithms for Frequent Itemset Mining from Uncertain Data
    Leung, Carson Kai-Sang
    MacKinnon, Richard Kyle
    Tanbeer, Syed K.
    2014 IEEE INTERNATIONAL CONFERENCE ON DATA MINING (ICDM), 2014, : 893 - 898
  • [32] Discovering Process Models from Uncertain Event Data
    Pegoraro, Marco
    Uysal, Merih Seran
    van der Aalst, Wil M. P.
    BUSINESS PROCESS MANAGEMENT WORKSHOPS (BPM 2019), 2019, 362 : 238 - 249
  • [33] CERTAIN MODELS FROM UNCERTAIN DATA - THE ALGEBRAIC CASE
    GUIDORZI, RP
    SYSTEMS & CONTROL LETTERS, 1991, 17 (06) : 415 - 424
  • [34] Learning Gaussian Process Models from Uncertain Data
    Dallaire, Patrick
    Besse, Camille
    Chaib-draa, Brahim
    NEURAL INFORMATION PROCESSING, PT 1, PROCEEDINGS, 2009, 5863 : 433 - 440
  • [35] Mathematical Models for Logistics Network Optimization with Uncertain Data
    Peng, Jin
    2019 INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY AND COMPUTER COMMUNICATIONS (ITCC 2019), 2019, : 93 - 100
  • [36] Efficient and Progressive Algorithms for Distributed Skyline Queries over Uncertain Data
    Ding, Xiaofeng
    Jin, Hai
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (08) : 1448 - 1462
  • [37] Efficient and Progressive Algorithms for Distributed Skyline Queries over Uncertain Data
    Ding, Xiaofeng
    Jin, Hai
    2010 INTERNATIONAL CONFERENCE ON DISTRIBUTED COMPUTING SYSTEMS ICDCS 2010, 2010,
  • [38] Algorithms for the Test of Independence of Two Categorical Variables over Uncertain Data
    Kooakachai, Monchai
    THAI JOURNAL OF MATHEMATICS, 2022, : 62 - 74
  • [39] REPRESENTING, COMBINING AND USING UNCERTAIN ESTIMATES.
    Hamburger, Henry
    1986, 4 : 399 - 414
  • [40] A comparison of methods for representing the meaning of uncertain evidence
    Groen, FJ
    Mosleh, A
    PROBABILISTIC SAFETY ASSESSMENT AND MANAGEMENT, VOL I AND II, PROCEEDINGS, 2002, : 207 - 213