Reporting l most influential objects in uncertain databases based on probabilistic reverse top-k queries

Cited by: 51
Authors
Xiao, Guoqing [1 ,2 ]
Li, Kenli [1 ,2 ]
Li, Keqin [1 ,2 ,3 ]
Affiliations
[1] Hunan Univ, Coll Informat Sci & Engn, Changsha 410082, Hunan, Peoples R China
[2] Natl Supercomputing Ctr Changsha, Changsha 410082, Hunan, Peoples R China
[3] SUNY Coll New Paltz, Dept Comp Sci, New Paltz, NY 12561 USA
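Funding
National Natural Science Foundation of China; Foreign Science and Technology Cooperation Project (International Science and Technology Project);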
Keywords
Data management; Probabilistic reverse top-k queries; Probabilistic skyline queries; Probabilistic top-l influential queries; Uncertain databases; EFFICIENT; PRODUCTS;
DOI
10.1016/j.ins.2017.04.028
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Reverse top-k queries take the perspective of a product manufacturer and are essential for assessing the potential market of a product. However, the existing approaches for reverse top-k queries all assume that the underlying data are exact (or certain). Because of the intrinsic differences between uncertain and certain data, these methods cannot be applied directly to uncertain data sets. Motivated by this, in this paper we first model probabilistic reverse top-k queries over uncertain data. We then formulate a probabilistic top-l influential query that reports the l most influential objects, i.e., those with the largest impact factors, where the impact factor of an object is defined as the cardinality of its probabilistic reverse top-k query result set. We present effective pruning heuristics for speeding up the queries. In particular, we exploit several properties of probabilistic threshold top-k queries and probabilistic skyline queries to reduce the search space of this problem. In addition, an upper bound on the number of potential users is estimated to reduce the cost of computing the probabilistic reverse top-k queries for the candidate objects. Finally, we present efficient query algorithms that seamlessly integrate the proposed pruning strategies. Extensive experiments using both real-world and synthetic data sets demonstrate the efficiency and effectiveness of our proposed algorithms. (C) 2017 Elsevier Inc. All rights reserved.
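The impact-factor definition in the abstract can be illustrated with a brute-force sketch. This is not the paper's algorithm: it applies none of the pruning heuristics and simply enumerates all possible worlds, so it is only usable on tiny inputs. Several details are assumptions made for illustration, since the abstract does not specify them: objects are independent and each has a discrete set of instances whose probabilities sum to 1, users are linear weight vectors, lower scores rank better, ties favor the target object, and an object belongs to a user's result when its top-k probability reaches a threshold `tau`. All function and variable names are hypothetical.

```python
from itertools import product


def topk_probability(objects, target, w, k):
    """Probability that `target` ranks in the top-k for weight vector `w`,
    computed by enumerating every possible world (assumed semantics)."""
    names = list(objects)
    prob = 0.0
    # A possible world picks exactly one (point, probability) instance per object.
    for world in product(*(objects[n] for n in names)):
        world_p = 1.0
        scores = {}
        for n, (point, p) in zip(names, world):
            world_p *= p
            scores[n] = sum(wi * xi for wi, xi in zip(w, point))
        # Assumption: lower score is better, ties are resolved in favor of the target.
        better = sum(1 for n in names if n != target and scores[n] < scores[target])
        if better < k:
            prob += world_p
    return prob


def impact_factor(objects, target, users, k, tau):
    """Cardinality of the probabilistic reverse top-k result set of `target`:
    the number of users whose top-k contains it with probability >= tau."""
    return sum(1 for w in users if topk_probability(objects, target, w, k) >= tau)


def top_l_influential(objects, users, k, tau, l):
    """The l objects with the largest impact factors (name breaks ties)."""
    ranked = sorted(objects, key=lambda n: (-impact_factor(objects, n, users, k, tau), n))
    return ranked[:l]


# Toy data: object "b" is uncertain, with two equally likely instances.
objects = {
    "a": [((1, 1), 1.0)],
    "b": [((2, 2), 0.5), ((0, 0), 0.5)],
    "c": [((3, 3), 1.0)],
}
users = [(0.5, 0.5), (0.9, 0.1)]

print(top_l_influential(objects, users, k=1, tau=0.5, l=2))
```

In the toy data, "a" and "b" each rank first in exactly half of the possible worlds for both users, so with `tau=0.5` both have impact factor 2, while "c" never ranks first and has impact factor 0. The enumeration cost grows exponentially with the number of uncertain objects, which is the kind of cost the pruning strategies described in the abstract aim to avoid.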
Pages: 207-226
Page count: 20