Selectivity estimation without the attribute value independence assumption

Authors
Poosala, V [1]
Ioannidis, YE [1]
Affiliations
[1] AT&T Bell Labs, Murray Hill, NJ 07974 USA
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The result size of a query that involves multiple attributes from the same relation depends on these attributes' joint data distribution, i.e., the frequencies of all combinations of attribute values. To simplify the estimation of that size, most commercial systems make the attribute value independence assumption and maintain statistics (typically histograms) on individual attributes only. In reality, this assumption is almost always wrong and the resulting estimations tend to be highly inaccurate. In this paper, we propose two main alternatives to effectively approximate (multi-dimensional) joint data distributions: (a) using a multi-dimensional histogram, and (b) using the Singular Value Decomposition (SVD) technique from linear algebra. An extensive set of experiments demonstrates the advantages and disadvantages of the two approaches and the benefits of both compared to the independence assumption.
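To illustrate why the independence assumption can mislead, here is a minimal Python sketch. It is not the paper's method: the toy relation, the attribute names, and both helper functions are hypothetical. It compares the true selectivity of a two-attribute equality predicate, taken from the joint frequencies, with the estimate obtained by multiplying per-attribute selectivities.

```python
from collections import Counter

# Hypothetical toy relation of (department, salary_band) pairs.
# The two attributes are deliberately correlated.
rows = (
    [("eng", "high")] * 40
    + [("eng", "low")] * 10
    + [("ops", "high")] * 5
    + [("ops", "low")] * 45
)


def true_selectivity(rows, a, b):
    """Fraction of rows matching both values, from the joint distribution."""
    joint = Counter(rows)
    return joint[(a, b)] / len(rows)


def independence_estimate(rows, a, b):
    """Product of per-attribute selectivities (independence assumption)."""
    n = len(rows)
    freq_a = Counter(r[0] for r in rows)
    freq_b = Counter(r[1] for r in rows)
    return (freq_a[a] / n) * (freq_b[b] / n)


actual = true_selectivity(rows, "eng", "high")
estimate = independence_estimate(rows, "eng", "high")
print(f"actual={actual:.3f} independence estimate={estimate:.3f}")
```

On this toy data the independence estimate falls well below the true selectivity, because the per-attribute frequencies carry no information about how the two attributes co-occur.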
Pages: 486-495
Page count: 10