Selectivity estimation without the attribute value independence assumption

Cited by: 0
Authors
Poosala, V [1]
Ioannidis, YE [1]
Affiliations
[1] AT&T Bell Labs, Murray Hill, NJ 07974 USA
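Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract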
The result size of a query that involves multiple attributes from the same relation depends on these attributes' joint data distribution, i.e., the frequencies of all combinations of attribute values. To simplify the estimation of that size, most commercial systems make the attribute value independence assumption and maintain statistics (typically histograms) on individual attributes only. In reality, this assumption is almost always wrong and the resulting estimations tend to be highly inaccurate. In this paper, we propose two main alternatives to effectively approximate (multi-dimensional) joint data distributions: (a) using a multi-dimensional histogram, and (b) using the Singular Value Decomposition (SVD) technique from linear algebra. An extensive set of experiments demonstrates the advantages and disadvantages of the two approaches and the benefits of both compared to the independence assumption.
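To make the independence assumption concrete, the following is a minimal sketch, not taken from the paper, of how an estimate built from per-attribute frequencies alone can diverge from the true result size. The toy relation `rows` and the helper names `true_count` and `independence_estimate` are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy relation with two perfectly correlated attributes (A, B).
rows = [("a", 1)] * 4 + [("b", 2)] * 4  # 8 tuples in total


def true_count(rows, x, y):
    """Exact result size of the predicate A = x AND B = y."""
    return sum(1 for a, b in rows if a == x and b == y)


def independence_estimate(rows, x, y):
    """Estimate under attribute value independence: n * P(A = x) * P(B = y).

    Only the per-attribute frequencies are used, as a system keeping
    one-dimensional statistics would do.
    """
    n = len(rows)
    freq_a = Counter(a for a, _ in rows)
    freq_b = Counter(b for _, b in rows)
    return n * (freq_a[x] / n) * (freq_b[y] / n)


# A combination that occurs often: the estimate is too low.
exact_present = true_count(rows, "a", 1)
estimate_present = independence_estimate(rows, "a", 1)

# A combination that never occurs: the estimate is too high.
exact_absent = true_count(rows, "a", 2)
estimate_absent = independence_estimate(rows, "a", 2)

print(exact_present, estimate_present)
print(exact_absent, estimate_absent)
```

In this toy case the per-attribute statistics cannot distinguish the two value combinations, so both receive the same estimate even though one matches half the relation and the other matches nothing. Approximating the joint distribution directly, as the abstract proposes, is meant to address this kind of error.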
Pages: 486-495
Page count: 10