A sampling approach for skyline query cardinality estimation

Cited by: 0
Authors
Cheng Luo
Zhewei Jiang
Wen-Chi Hou
Shan He
Qiang Zhu
Affiliations
[1] Coppin State University, Department of Mathematics and Computer Science
[2] Frederick Community College, Computer Science Department
[3] Southern Illinois University Carbondale, School of Economics and Management
[4] Southwest Petroleum University, Department of Computer and Information Science
[5] University of Michigan
Source
Knowledge and Information Systems, 2012, 32 (02)
Keywords
Skyline query; Cardinality estimation; Sampling
DOI
Not available
Abstract
A skyline query returns a set of candidate records that satisfy several preferences. It is an operation commonly performed to aid decision making. Since executing a skyline query is expensive and a query plan may combine skyline queries with other data operations such as joins, it is important that the query optimizer can quickly produce an accurate cardinality estimate for a skyline query. Log Sampling (LS) and Kernel-Based (KB) skyline cardinality estimation are the two state-of-the-art skyline cardinality estimation methods. LS is based on a hypothetical model A(log n)^B. Since this model was originally derived under strong assumptions, such as independence between dimensions, it does not apply well to arbitrary data sets. Consequently, LS can yield large estimation errors. KB relies on integrating the estimated probability density function (PDF) to derive the scale factor Ψ_ds. As the estimation of the PDF and the ensuing integration both involve complex mathematical calculations, KB is time-consuming. In view of these problems, we propose an innovative purely sampling-based (PS) method for skyline cardinality estimation. PS is non-parametric. It does not assume any particular data distribution and is thus more robust than LS. PS does not require complex mathematical calculations. Therefore, it is much simpler to implement and much faster at producing estimates than KB. Extensive empirical studies show that, for a variety of real and synthetic data sets, PS outperforms LS in terms of estimation speed, estimation accuracy, and estimation variability under the same space budget. PS outperforms KB in terms of estimation speed and estimation variability under the same performance mark.
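For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of a skyline computation and of extrapolating a model of the form A(log n)^B from two samples, which is the kind of model the abstract attributes to LS. It is not the PS method proposed in the paper, whose details the abstract does not give. The function names, the naive quadratic skyline routine, the choice of two sample sizes, and the convention that smaller values are preferred in every dimension are all assumptions made for illustration.

```python
import math
import random


def dominates(p, q):
    """Return True if p dominates q: p is no worse in every dimension
    and strictly better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Naive O(n^2) skyline: keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def fit_log_model(n1, s1, n2, s2):
    """Solve s = A * (log n)^B for A and B from two (sample size, skyline size)
    pairs. Requires n1 != n2, both greater than 1, and s1, s2 > 0."""
    b = math.log(s2 / s1) / math.log(math.log(n2) / math.log(n1))
    a = s1 / math.log(n1) ** b
    return a, b


def estimate_skyline_size(points, n1, n2, rng):
    """Illustrative estimate: measure skyline sizes on two random samples,
    fit the log model, and extrapolate to the full data set size."""
    s1 = len(skyline(rng.sample(points, n1)))
    s2 = len(skyline(rng.sample(points, n2)))
    a, b = fit_log_model(n1, s1, n2, s2)
    return a * math.log(len(points)) ** b


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical synthetic data: 2000 points with 3 independent dimensions.
    data = [tuple(rng.random() for _ in range(3)) for _ in range(2000)]
    print("estimated skyline size:", estimate_skyline_size(data, 100, 400, rng))
    print("actual skyline size:", len(skyline(data)))
```

On independent uniform data such as the synthetic set above, the log model is a reasonable fit. The abstract's point is that on correlated or otherwise arbitrary data the same extrapolation can be far off.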
Pages: 281-301
Page count: 20
Related papers
50 in total
  • [1] A sampling approach for skyline query cardinality estimation
    Luo, Cheng
    Jiang, Zhewei
    Hou, Wen-Chi
    He, Shan
    Zhu, Qiang
    KNOWLEDGE AND INFORMATION SYSTEMS, 2012, 32 (02) : 281 - 301
  • [2] A cardinality-tunable skyline query: Fuzzy skyline
    School of Information Science and Engineering, Northeastern University, Shenyang 110004, China
    Dongbei Daxue Xuebao, 2009, 12 (1706-1709):
  • [3] Kernel-Based Skyline Cardinality Estimation
    Zhang, Zhenjie
    Yang, Yin
    Cai, Ruichu
    Papadias, Dimitris
    Tung, Anthony
    ACM SIGMOD/PODS 2009 CONFERENCE, 2009, : 509 - 521
  • [4] Effective skyline cardinality estimation on data streams
    Lu, Yang
    Zhao, Jiakui
    Chen, Lijun
    Cui, Bin
    Yang, Dongqing
    DATABASE AND EXPERT SYSTEMS APPLICATIONS, PROCEEDINGS, 2008, 5181 : 241 - +
  • [5] Cardinality Estimation of Subgraph Matching: A Filtering-Sampling Approach
    Shin, Wonseok
    Song, Siwoo
    Park, Kunsoo
    Han, Wook-Shin
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2024, 17 (07): : 1697 - 1709
  • [6] Tuning the Cardinality of Skyline
    Huang, Jianmei
    Ding, Dabin
    Wang, Guoren
    Xin, Junchang
    ADVANCED WEB AND NETWORK TECHNOLOGIES, AND APPLICATIONS, 2008, 4977 : 220 - 231
  • [7] Cardinality estimation in query for probability RDF graphs
    Zhang, Deng-Yi
    Wu, Wen-Li
    Ouyang, Chu-Fei
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2015, 43 (09): : 1745 - 1749
  • [8] Tuning the cardinality of skyline
    Northeastern University, Shenyang
    110004, China
    Lect. Notes Comput. Sci., 2008, (220-231):
  • [9] Query estimation by adaptive sampling
    Wu, YL
    Agrawal, D
    El Abbadi, A
    18TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 2002, : 639 - 648
  • [10] Effective approach for an extended P-skyline query
    Zhou, Xu
    Zhou, Yantao
    Xiao, Guoqing
    Zeng, Yifu
    Zheng, Fei
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2016, 31 (02) : 849 - 858