Ranking Query Answers in Probabilistic Databases: Complexity and Efficient Algorithms

Cited by: 10
Authors
Olteanu, Dan [1]
Wen, Hongkai [1]
Affiliation
[1] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1109/ICDE.2012.61
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In many applications of probabilistic databases, the probabilities are mere degrees of uncertainty in the data and are not otherwise meaningful to the user. Often, users care only about the ranking of answers in decreasing order of their probabilities or about a few most likely answers. In this paper, we investigate the problem of ranking query answers in probabilistic databases. We give a dichotomy for ranking in case of conjunctive queries without repeating relation symbols: it is either in polynomial time or #P-hard. Surprisingly, our syntactic characterisation of tractable queries is not the same as for probability computation. The key observation is that there are queries for which probability computation is #P-hard, yet ranking can be computed in polynomial time. This is possible whenever probability computation for distinct answers has a common factor that is hard to compute but irrelevant for ranking. We complement this tractability analysis with an effective ranking technique for conjunctive queries. Given a query, we construct a share plan, which exposes subqueries whose probability computation can be shared or ignored across query answers. Our technique combines share plans with incremental approximate probability computation of subqueries. We implemented our technique in the SPROUT query engine and report on performance gains of orders of magnitude over Monte Carlo simulation using FPRAS and exact probability computation based on knowledge compilation.
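The key observation in the abstract, that a factor common to all answers is irrelevant for ranking, can be illustrated with a minimal sketch. It is not the paper's share-plan algorithm or SPROUT code. It assumes, purely for illustration, that each answer's probability factorises as a product of one shared positive factor and an answer-specific factor. The function names, answer identifiers, and numbers are all hypothetical.

```python
from fractions import Fraction


def rank_by_local_factor(local_factors):
    """Rank answers by their answer-specific factor only (ties broken by id)."""
    return sorted(local_factors, key=lambda a: (-local_factors[a], a))


def rank_by_full_probability(local_factors, shared_factor):
    """Rank answers by full probability = shared_factor * local factor."""
    full = {a: shared_factor * p for a, p in local_factors.items()}
    return sorted(full, key=lambda a: (-full[a], a))


# Hypothetical answer-specific factors, assumed cheap to compute.
local = {
    "a1": Fraction(3, 10),
    "a2": Fraction(7, 10),
    "a3": Fraction(1, 2),
}

# Stand-in for the factor shared by all answers. In the setting the abstract
# describes, this is the part that is hard to compute; here it is just a
# made-up positive constant so that both rankings can be compared.
shared = Fraction(2, 5)

ranking_without_shared = rank_by_local_factor(local)
ranking_with_shared = rank_by_full_probability(local, shared)

print(ranking_without_shared)
print(ranking_with_shared)
```

Multiplying every answer's factor by the same positive constant preserves their order, so both calls return the same ranking and the shared factor never has to be evaluated when only the order matters.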
Pages: 282-293
Page count: 12