Self-spatial join selectivity estimation using fractal concepts

Cited by: 22
Authors
Belussi, A
Faloutsos, C
Affiliations
[1] Politecn Milan, Dipartimento Elettr & Informaz, I-20133 Milan, Italy
[2] Univ Maryland, Dept Comp Sci, College Pk, MD 20742 USA
[3] Univ Maryland, Syst Res Inst, College Pk, MD 20742 USA
Keywords
algorithms; theory; fractal dimension; range query; selectivity estimation; spatial join
DOI
10.1145/279339.279342
Chinese Library Classification (CLC) code
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The problem of selectivity estimation for queries of nontraditional databases is still an open issue. In this article, we examine the problem of selectivity estimation for some types of spatial queries in databases containing real data. We have shown earlier [Faloutsos and Kamel 1994] that real point sets typically have a nonuniform distribution, consistently violating the uniformity and independence assumptions. Moreover, we demonstrated that the theory of fractals can help to describe real point sets. In this article we show how the concept of fractal dimension, i.e., (noninteger) dimension, can lead to a solution for the selectivity estimation problem in spatial databases. Among the infinite family of fractal dimensions, we consider here the Hausdorff fractal dimension D(0) and the "Correlation" fractal dimension D(2). Specifically, we show that (a) the average number of neighbors for a given point set follows a power law, with D(2) as exponent, and (b) the average number of nonempty range queries follows a power law with E - D(0) as exponent (E is the dimension of the embedding space). We present the formulas to estimate the selectivity for "biased" range queries, for self-spatial joins, and for the average number of nonempty range queries. The results of some experiments on real and synthetic point sets are shown. Our formulas achieve very low relative errors, typically about 10%, versus 40%-100% for the formulas that are based on the uniformity and independence assumptions.
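Claim (a) in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' method; it is a brute-force illustration under stated assumptions: distances are measured with the L-infinity norm, the exponent is read off as the slope of a least-squares line through log(pair count) versus log(radius), and the function names `pair_counts`, `correlation_dimension`, and `avg_neighbors` are hypothetical names chosen for this example.

```python
import numpy as np


def pair_counts(points, radii):
    """Count unordered pairs of distinct points within each radius.

    Assumption for this sketch: L-infinity (Chebyshev) distance, computed
    by brute force in O(n^2) memory, so it only suits small point sets.
    """
    pts = np.asarray(points, dtype=float)
    n, dims = pts.shape
    dist = np.zeros((n, n))
    for d in range(dims):
        col = pts[:, d]
        # Running maximum of per-coordinate differences = Chebyshev distance.
        dist = np.maximum(dist, np.abs(col[:, None] - col[None, :]))
    upper = dist[np.triu_indices(n, k=1)]  # each pair once, no self-pairs
    return np.array([np.count_nonzero(upper <= r) for r in radii])


def correlation_dimension(points, radii):
    """Estimate D(2) as the slope of log(pair count) versus log(radius).

    Every radius must capture at least one pair, otherwise log(0) occurs.
    """
    counts = pair_counts(points, radii)
    slope, _intercept = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope


def avg_neighbors(points, radius):
    """Average number of other points within `radius` of a point."""
    n = len(points)
    # Each unordered pair contributes one neighbor to both of its endpoints.
    return 2.0 * pair_counts(points, [radius])[0] / n


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    square = rng.random((1200, 2))              # uniform points in the unit square
    t = np.arange(1000) / 1000.0
    line = np.column_stack([t, t])              # points along a diagonal segment
    radii = np.geomspace(0.01, 0.1, 6)
    print("square:", correlation_dimension(square, radii))
    print("line:  ", correlation_dimension(line, radii))
```

Under these assumptions the fitted slope should come out close to 2 for the uniform square and close to 1 for the diagonal line, which is the sense in which a point set embedded in two dimensions can have a lower intrinsic dimension. For a self-spatial join with distance threshold `radius`, the pair count returned by `pair_counts` is the quantity whose growth the power law describes.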
Pages: 161-201
Number of pages: 41
Related papers
50 in total
  • [1] Analysis of range queries and self-spatial join queries on real region datasets stored using an R-tree
    Proietti, G
    Faloutsos, C
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2000, 12 (05) : 751 - 762
  • [2] Spatial join selectivity using power laws
    Faloutsos, C
    Seeger, B
    Traina, A
    Traina, C
    SIGMOD RECORD, 2000, 29 (02) : 177 - 188
  • [3] Parallel Selectivity Estimation for Optimizing Multidimensional Spatial Join Processing on GPUs
    Zhang, Jianting
    You, Simin
    Gruenwald, Le
    2017 IEEE 33RD INTERNATIONAL CONFERENCE ON DATA ENGINEERING (ICDE 2017), 2017, : 1591 - 1598
  • [4] Multi-way spatial join selectivity for the ring join graph
    Min, JK
    Park, HH
    Chung, CW
    INFORMATION AND SOFTWARE TECHNOLOGY, 2005, 47 (12) : 785 - 795
  • [5] Selectivity estimation using compressed spatial information
    JEONG Jae hyuck
    CHI Jeong hee
    RYU Keun ho
    Journal of Chongqing Institute of Posts and Telecommunications (Natural Science Edition) [重庆邮电学院学报(自然科学版)], 2004, (05) : 156 - 160
  • [6] Cost estimation of spatial join in spatialhadoop
    A. Belussi
    S. Migliorini
    A. Eldawy
    GeoInformatica, 2020, 24 : 1021 - 1059
  • [7] Cost estimation of spatial join in spatialhadoop
    Belussi, A.
    Migliorini, S.
    Eldawy, A.
    GEOINFORMATICA, 2020, 24 (04) : 1021 - 1059
  • [8] Spatial selectivity estimation using compressed histogram information
    Chi, JH
    Kim, SH
    Ryu, KH
    WEB TECHNOLOGIES RESEARCH AND DEVELOPMENT - APWEB 2005, 2005, 3399 : 489 - 494
  • [9] The Generic Annular Bucket Histogram for Estimating the Selectivity of Spatial Selection and Spatial Join
    Cheng Changxiu
    Zhou Chenghu
    Chen Rongguo
    GEO-SPATIAL INFORMATION SCIENCE, 2011, 14 (04) : 262 - 273
  • [10] A Self-Spatial Adaptive Weighting Based U-Net for Image Segmentation
    Cho, Choongsang
    Lee, Young Han
    Park, Jongyoul
    Lee, Sangkeun
    ELECTRONICS, 2021, 10 (03) : 1 - 11