U-Skyline: A New Skyline Query for Uncertain Databases

Cited by: 37
Authors
Liu, Xingjie [1]
Yang, De-Nian [2, 3]
Ye, Mao [4]
Lee, Wang-Chien [5]
Affiliations
[1] Penn State Univ, Mountain View, CA 94043 USA
[2] Acad Sinica, Inst Informat Sci, Taipei 11529, Taiwan
[3] Acad Sinica, Res Ctr Informat Technol Innovat, Taipei 11529, Taiwan
[4] Klout, San Francisco, CA 94107 USA
[5] Penn State Univ, Dept Comp Sci & Engn, University Pk, PA 16802 USA
Keywords
Skyline query; uncertain databases; query processing
DOI
10.1109/TKDE.2012.33
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The skyline query, which identifies the set of skyline tuples that are not dominated by any other tuple, is particularly useful for multicriteria data analysis and decision making. For uncertain databases, a probabilistic skyline query, called P-Skyline, has been developed to return skyline tuples that satisfy a specified probability threshold. However, the answer obtained via a P-Skyline query usually includes skyline tuples that undesirably dominate each other when a small threshold is specified, and it may contain far fewer skyline tuples when a larger threshold is employed. To address this concern, we propose a new uncertain skyline query, called the U-Skyline query, in this paper. Instead of setting a probabilistic threshold to qualify each skyline tuple independently, the U-Skyline query searches for the set of tuples that has the highest probability (aggregated from all possible scenarios) of being the skyline answer. To answer U-Skyline queries efficiently, we propose a number of optimization techniques for query processing, including 1) computational simplification of the U-Skyline probability, 2) pruning of unqualified candidate skylines and early termination of query processing, 3) reduction of the input data set, and 4) partition and conquest of the reduced data set. We perform a comprehensive performance evaluation of our algorithm and of an alternative approach that formulates the U-Skyline processing problem as an integer program. Experimental results demonstrate that our algorithm is 10-100 times faster than using CPLEX, a parallel integer programming solver, to answer the U-Skyline query.
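The contrast the abstract draws between P-Skyline and U-Skyline can be illustrated with a small brute-force sketch. It is not the authors' algorithm and uses none of the four optimizations listed above; it simply enumerates every possible scenario. The uncertainty model (each tuple exists independently with a given probability), the smaller-is-better dominance convention, the function names, and the sample data are all assumptions made for illustration.

```python
from itertools import product


def dominates(a, b):
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are better in every dimension.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(world):
    """Ids of the tuples in `world` (id -> values) that no other tuple dominates."""
    return frozenset(
        i for i, a in world.items()
        if not any(dominates(b, a) for j, b in world.items() if j != i)
    )


def skyline_distribution(tuples):
    """Map each candidate skyline set to its probability summed over all scenarios.

    `tuples` maps id -> (values, existence_probability). Tuples are assumed to
    exist independently (a hypothetical model, not stated in the abstract).
    Enumeration is exponential in the number of tuples.
    """
    ids = list(tuples)
    dist = {}
    for present in product([False, True], repeat=len(ids)):
        prob = 1.0
        world = {}
        for i, here in zip(ids, present):
            values, p = tuples[i]
            prob *= p if here else 1.0 - p
            if here:
                world[i] = values
        key = skyline(world)
        dist[key] = dist.get(key, 0.0) + prob
    return dist


def u_skyline(tuples):
    """Return (set, probability) for the most probable skyline set as a whole."""
    dist = skyline_distribution(tuples)
    return max(dist.items(), key=lambda kv: kv[1])


def p_skyline(tuples, threshold):
    """Return ids whose individual skyline probability reaches `threshold`."""
    member = {}
    for s, prob in skyline_distribution(tuples).items():
        for i in s:
            member[i] = member.get(i, 0.0) + prob
    return {i for i, pr in member.items() if pr >= threshold}


# Hypothetical data: "a" dominates "b"; "c" is incomparable with both.
sample = {"a": ((1, 4), 0.6), "b": ((2, 5), 0.9), "c": ((4, 1), 0.7)}
print(u_skyline(sample))
print(sorted(p_skyline(sample, 0.3)), sorted(p_skyline(sample, 0.65)))
```

On this sample, the individual skyline probabilities are 0.6 for a, 0.36 for b, and 0.7 for c. A threshold of 0.3 therefore returns all three tuples even though a dominates b, while a threshold of 0.65 returns only c. The set-level query instead returns {a, c}, which is the skyline in scenarios with total probability 0.42.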
Pages: 945-960
Page count: 16
Related papers
50 in total
  • [1] A Framework for Evaluating Skyline Query over Uncertain Autonomous Databases
    Saad, Nurul Husna Mohd
    Ibrahim, Hamidah
    Alwan, Ali Amer
    Sidi, Fatimah
    Yaakob, Razali
    [J]. 2014 INTERNATIONAL CONFERENCE ON COMPUTATIONAL SCIENCE, 2014, 29 : 1546 - 1556
  • [2] Skyline ranking for uncertain databases
    Yong, Hyountaek
    Lee, Jongwuk
    Kim, Jinha
    Hwang, Seung-won
    [J]. INFORMATION SCIENCES, 2014, 273 : 247 - 262
  • [3] Uncertain Dynamic Skyline Queries for Uncertain Databases
    Yang, ZhiBang
    Yang, XiaoNiu
    Zhou, Xu
    [J]. 2015 12TH INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS AND KNOWLEDGE DISCOVERY (FSKD), 2015, : 1797 - 1802
  • [4] Reverse Skyline Search in Uncertain Databases
    Lian, Xiang
    Chen, Lei
    [J]. ACM TRANSACTIONS ON DATABASE SYSTEMS, 2010, 35 (01):
  • [5] Subarray Skyline Query Processing in Array Databases
    Choi, Dalsu
    Yoon, Hyunsik
    Chung, Yon Dohn
    [J]. 33RD INTERNATIONAL CONFERENCE ON SCIENTIFIC AND STATISTICAL DATABASE MANAGEMENT (SSDBM 2021), 2020, : 37 - 48
  • [6] Skyline Probabilities with Range Query on Uncertain Dimensions
    Saad, Nurul Husna Mohd
    Ibrahim, Hamidah
    Sidi, Fatimah
    Yaakob, Razali
    [J]. ADVANCES IN COMPUTER COMMUNICATION AND COMPUTATIONAL SCIENCES, IC4S 2018, 2019, 924 : 225 - 242
  • [7] Computing Range Skyline Query on Uncertain Dimension
    Saad, Nurul Husna Mohd
    Ibrahim, Hamidah
    Sidi, Fatimah
    Yaakob, Razali
    Alwan, Ali Amer
    [J]. DATABASE AND EXPERT SYSTEMS APPLICATIONS, DEXA 2016, PT II, 2016, 9828 : 377 - 388
  • [8] Bounds on skyline probability for databases with uncertain preferences
    Pujari, Arun K.
    Padmanabhan, Vineet
    Kagita, Venkateswara Rao
    [J]. INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2017, 80 : 199 - 213
  • [9] Computing Exact Skyline Probabilities for Uncertain Databases
    Kim, Dongwon
    Im, Hyeonseung
    Park, Sungwoo
    [J]. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (12) : 2113 - 2126
  • [10] Skyline-join query processing in distributed databases
    Bai, Mei
    Xin, Junchang
    Wang, Guoren
    Zimmermann, Roger
    Wang, Xite
    [J]. FRONTIERS OF COMPUTER SCIENCE, 2016, 10 (02) : 330 - 352