Ranking queries on uncertain data

Cited by: 14
Authors
Hua, Ming [1]
Pei, Jian [2]
Lin, Xuemin [3, 4]
Affiliations
[1] Facebook Inc, Cambridge, MA USA
[2] Simon Fraser Univ, Burnaby, BC V5A 1S6, Canada
[3] Univ New S Wales, Sydney, NSW, Australia
[4] NICTA, Sydney, NSW, Australia
Source
VLDB JOURNAL | 2011 / Vol. 20 / Issue 01
Funding
Natural Sciences and Engineering Research Council of Canada;
Keywords
Uncertain data; Probabilistic ranking queries; Query processing;
DOI
10.1007/s00778-010-0196-4
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Uncertain data is inherent in a few important applications. It is far from trivial to extend ranking queries (also known as top-k queries), a popular type of queries on certain data, to uncertain data. In this paper, we cast ranking queries on uncertain data using three parameters: rank threshold k, probability threshold p, and answer set size threshold l. Systematically, we identify four types of ranking queries on uncertain data. First, a probability threshold top-k query computes the uncertain records taking a probability of at least p to be in the top-k list. Second, a top-(k, l) query returns the top-l uncertain records whose probabilities of being ranked among top-k are the largest. Third, the p-rank of an uncertain record is the smallest number k such that the record takes a probability of at least p to be ranked in the top-k list. A rank threshold top-k query retrieves the records whose p-ranks are at most k. Last, a top-(p, l) query returns the top-l uncertain records with the smallest p-ranks. To answer such ranking queries, we present an efficient exact algorithm, a fast sampling algorithm, and a Poisson approximation-based algorithm. To answer top-(k, l) queries and top-(p, l) queries, we propose PRist+, a compact index. An efficient index construction algorithm and efficacious query answering methods are developed for PRist+. An empirical study using real and synthetic data sets verifies the effectiveness of the probabilistic ranking queries and the efficiency of our methods.
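The four query types in the abstract can be made concrete with a small sketch. It assumes a deliberately simple uncertainty model that the abstract does not specify: each record exists independently with a given probability, scores are distinct, and records are ranked by descending score. All function names (`topk_probabilities`, `pt_k`, `top_k_l`, `p_rank`, `rt_k`, `top_p_l`) are hypothetical, and the dynamic program below is a generic illustration, not the paper's exact algorithm, sampling method, Poisson approximation, or the PRist+ index.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical record layout: (id, score, existence probability).
Record = Tuple[str, float, float]


def topk_probabilities(records: List[Record], max_k: int) -> Dict[str, List[float]]:
    """For each record, return [Pr(in top-1), ..., Pr(in top-max_k)].

    Assumes records exist independently and scores are distinct.
    """
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    # dist[j] = probability that exactly j higher-scored records exist (j < max_k).
    dist = [1.0] + [0.0] * (max_k - 1)
    result: Dict[str, List[float]] = {}
    for rid, _score, prob in ranked:
        # The record is in the top-k if it exists and at most k-1 better records exist.
        cumulative, running = [], 0.0
        for j in range(max_k):
            running += dist[j]
            cumulative.append(prob * running)
        result[rid] = cumulative
        # Fold this record into the distribution seen by lower-scored records.
        updated = [0.0] * max_k
        for j in range(max_k):
            updated[j] += dist[j] * (1.0 - prob)
            if j + 1 < max_k:
                updated[j + 1] += dist[j] * prob
        dist = updated
    return result


def pt_k(records: List[Record], k: int, p: float) -> List[str]:
    """Probability threshold top-k: records with top-k probability >= p."""
    probs = topk_probabilities(records, k)
    return [rid for rid, c in probs.items() if c[k - 1] >= p]


def top_k_l(records: List[Record], k: int, l: int) -> List[str]:
    """Top-(k, l): the l records with the largest top-k probabilities."""
    probs = topk_probabilities(records, k)
    return sorted(probs, key=lambda rid: probs[rid][k - 1], reverse=True)[:l]


def p_rank(cumulative: List[float], p: float) -> Optional[int]:
    """Smallest k with top-k probability >= p, or None if never reached."""
    for k, value in enumerate(cumulative, start=1):
        if value >= p:
            return k
    return None


def rt_k(records: List[Record], k: int, p: float) -> List[str]:
    """Rank threshold top-k: records whose p-rank is at most k."""
    probs = topk_probabilities(records, k)
    return [rid for rid, c in probs.items() if p_rank(c, p) is not None]


def top_p_l(records: List[Record], p: float, l: int) -> List[str]:
    """Top-(p, l): the l records with the smallest p-ranks."""
    probs = topk_probabilities(records, len(records))
    ranks = {rid: p_rank(c, p) for rid, c in probs.items()}
    finite = [rid for rid, r in ranks.items() if r is not None]
    return sorted(finite, key=lambda rid: ranks[rid])[:l]


# Made-up toy data for illustration only.
records = [("a", 90.0, 0.5), ("b", 80.0, 0.9), ("c", 70.0, 0.6)]
print(pt_k(records, k=2, p=0.4))
print(top_k_l(records, k=2, l=1))
print(top_p_l(records, p=0.5, l=2))
```

In this toy model the record with the highest score is not necessarily the most likely to appear in the top-k list, since a lower-scored record with a higher existence probability can overtake it. That is the kind of effect the three parameters k, p, and l are meant to expose.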
Pages: 129 - 153
Number of pages: 25