Ranking Query Answers in Probabilistic Databases: Complexity and Efficient Algorithms

Cited by: 10
Authors
Olteanu, Dan [1]
Wen, Hongkai [1]
Affiliations
[1] Univ Oxford, Dept Comp Sci, Oxford OX1 3QD, England
Funding
Engineering and Physical Sciences Research Council (EPSRC), UK
DOI
10.1109/ICDE.2012.61
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
In many applications of probabilistic databases, the probabilities are mere degrees of uncertainty in the data and are not otherwise meaningful to the user. Often, users care only about the ranking of answers in decreasing order of their probabilities or about a few most likely answers. In this paper, we investigate the problem of ranking query answers in probabilistic databases. We give a dichotomy for ranking in case of conjunctive queries without repeating relation symbols: it is either in polynomial time or #P-hard. Surprisingly, our syntactic characterisation of tractable queries is not the same as for probability computation. The key observation is that there are queries for which probability computation is #P-hard, yet ranking can be computed in polynomial time. This is possible whenever probability computation for distinct answers has a common factor that is hard to compute but irrelevant for ranking. We complement this tractability analysis with an effective ranking technique for conjunctive queries. Given a query, we construct a share plan, which exposes subqueries whose probability computation can be shared or ignored across query answers. Our technique combines share plans with incremental approximate probability computation of subqueries. We implemented our technique in the SPROUT query engine and report on performance gains of orders of magnitude over Monte Carlo simulation using FPRAS and exact probability computation based on knowledge compilation.
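The key observation in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that each answer's probability factorises as a positive common factor times an answer-specific factor; the function names, answer labels, and numbers below are hypothetical and are not taken from the paper or from SPROUT. Under that assumption, sorting by the answer-specific factor alone gives the same order as sorting by the full probability, so the hard common factor never needs to be computed.

```python
from fractions import Fraction


def rank_by_specific_factor(specific):
    """Rank answers in decreasing order of their answer-specific factor only."""
    return sorted(specific, key=lambda answer: specific[answer], reverse=True)


def rank_by_full_probability(specific, common_factor):
    """Rank answers by the full probability: common_factor * specific factor."""
    full = {answer: common_factor * p for answer, p in specific.items()}
    return sorted(full, key=lambda answer: full[answer], reverse=True)


# Hypothetical answer-specific factors (cheap to compute in this toy setting).
specific = {
    "a1": Fraction(3, 10),
    "a2": Fraction(7, 10),
    "a3": Fraction(1, 2),
}

# Hypothetical common factor: assumed expensive in practice, any value in (0, 1].
common_factor = Fraction(1, 8)

ranking_without_common = rank_by_specific_factor(specific)
ranking_with_common = rank_by_full_probability(specific, common_factor)

# Multiplying every score by the same positive constant preserves the order.
print(ranking_without_common == ranking_with_common)
```

This only demonstrates why a shared positive factor cannot change the order of answers; how such factors are identified in a query (the share plans mentioned above) is not modelled here.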
Pages: 282-293
Page count: 12