Multi-dimensional histograms with tight bounds for the error

Cited by: 0
Authors
Baltrunas, Linas
Mazeika, Arturas
Bohlen, Michael
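Institution
Source
10TH INTERNATIONAL DATABASE ENGINEERING AND APPLICATIONS SYMPOSIUM, PROCEEDINGS | 2006
Keywords
DOI
Not available
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812 ;
Abstract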
Histograms are used as non-parametric selectivity estimators for one-dimensional data. For high-dimensional data it is common either to compute one-dimensional histograms for each attribute or to compute a multi-dimensional equi-width histogram for a set of attributes. This yields either small low-quality or large high-quality histograms. In this paper we introduce HIRED (HIgh-dimensional histograms with dimensionality REDuction): small high-quality histograms for multi-dimensional data. HIRED histograms are adaptive, and they are based on the shape error and directional splits. The shape error permits precise control of the estimation error of the histogram and, together with directional splits, yields a memory complexity that does not depend on the number of uniform attributes in the dataset. We provide extensive experimental results with synthetic and real-world datasets. The experiments confirm that our method is as precise as state-of-the-art techniques and uses orders of magnitude less memory.
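The trade-off between the two baseline approaches named in the abstract can be illustrated with a minimal sketch. It is not the HIRED method: it only contrasts per-attribute one-dimensional equi-width histograms (combined under an attribute-independence assumption, which is our assumption and not stated in the abstract) with a single multi-dimensional equi-width histogram. All function names, the bucket count, and the toy dataset are hypothetical, and queries are restricted to bucket-aligned ranges to keep the example short.

```python
from collections import Counter

BINS = 4            # buckets per attribute (illustrative choice)
LO, HI = 0.0, 1.0   # assumed common domain of every attribute


def bucket(value, lo=LO, hi=HI, bins=BINS):
    """Map a value in [lo, hi] to its equi-width bucket index."""
    idx = int((value - lo) / (hi - lo) * bins)
    return min(max(idx, 0), bins - 1)


def build_1d(points, dim):
    """One-dimensional equi-width histogram over a single attribute."""
    counts = [0] * BINS
    for p in points:
        counts[bucket(p[dim])] += 1
    return counts


def build_nd(points):
    """Multi-dimensional equi-width histogram, stored sparsely as cell -> count."""
    return Counter(tuple(bucket(x) for x in p) for p in points)


def estimate_independent(hists, ranges, n):
    """Multiply per-attribute selectivities (attribute-independence assumption)."""
    sel = 1.0
    for counts, (b_lo, b_hi) in zip(hists, ranges):
        sel *= sum(counts[b_lo:b_hi + 1]) / n
    return sel


def estimate_nd(hist, ranges, n):
    """Sum the counts of all cells that fall inside the bucket-aligned query box."""
    hits = sum(
        count for cell, count in hist.items()
        if all(b_lo <= b <= b_hi for b, (b_lo, b_hi) in zip(cell, ranges))
    )
    return hits / n


# Toy dataset: 1000 perfectly correlated 2-D points on the diagonal y = x.
points = [(i / 1000, i / 1000) for i in range(1000)]
n = len(points)

hists_1d = [build_1d(points, d) for d in range(2)]
hist_nd = build_nd(points)

# Query: x in buckets 0..1 (x < 0.5) and y in buckets 2..3 (y >= 0.5).
query = [(0, 1), (2, 3)]
est_1d = estimate_independent(hists_1d, query, n)
est_nd = estimate_nd(hist_nd, query, n)

print("1-D histograms estimate:", est_1d)   # no point matches, yet the estimate is 0.25
print("multi-dim histogram estimate:", est_nd)
print("counters stored:", len(hists_1d) * BINS, "vs dense grid of", BINS ** 2)
```

On this toy data no point satisfies the query, so the true selectivity is 0. The multi-dimensional histogram returns 0.0, while the per-attribute histograms return 0.25 because they cannot see the correlation. The price is storage: a dense equi-width grid needs `BINS ** d` counters for `d` attributes, compared with `d * BINS` for the one-dimensional histograms.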
Pages: 105 - 112
Page count: 8
Related papers
50 in total
  • [2] New bounds for multi-dimensional packing
    Seiden, SS
    van Stee, R
    PROCEEDINGS OF THE THIRTEENTH ANNUAL ACM-SIAM SYMPOSIUM ON DISCRETE ALGORITHMS, 2002, : 486 - 495
  • [3] Clustering-based histograms for multi-dimensional data
    Furfaro, F
    Mazzeo, GM
    Sirangelo, C
    DATA WAREHOUSING AND KNOWLEDGE DISCOVERY, PROCEEDINGS, 2005, 3589 : 478 - 487
  • [4] Segmentation of multi-dimensional infrared imagery from histograms
    Silverman, J
    Rotman, SR
    Caefer, CE
    INFRARED PHYSICS & TECHNOLOGY, 2004, 45 (03) : 191 - 200
  • [5] Multi-dimensional color histograms for segmentation of wounds in images
    Kolesnik, M
    Fexa, A
    IMAGE ANALYSIS AND RECOGNITION, 2005, 3656 : 1014 - 1022
  • [6] Tight Bounds for Differentially Private Anonymized Histograms
    Manurangsi, Pasin
    2022 SYMPOSIUM ON SIMPLICITY IN ALGORITHMS, SOSA, 2022, : 203 - 213
  • [7] An efficient distance between multi-dimensional histograms for comparing images
    Serratosa, Francesc
    Sanroma, Gerard
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, PROCEEDINGS, 2006, 4109 : 412 - 421
  • [8] Compressed hierarchical binary histograms for summarizing multi-dimensional data
    Filippo Furfaro
    Giuseppe M. Mazzeo
    Domenico Saccà
    Cristina Sirangelo
    Knowledge and Information Systems, 2008, 15 : 335 - 380
  • [9] Compressed hierarchical binary histograms for summarizing multi-dimensional data
    Furfaro, Filippo
    Mazzeo, Giuseppe M.
    Sacca, Domenico
    Sirangelo, Cristina
    KNOWLEDGE AND INFORMATION SYSTEMS, 2008, 15 (03) : 335 - 380
  • [10] A new algorithm to compute the distance between multi-dimensional histograms
    Serratosa, Francesc
    Sanroma, Gerard
    Sanfeliu, Alberto
    PROGRESS IN PATTERN RECOGNITION, IMAGE ANALYSIS AND APPLICATIONS, PROCEEDINGS, 2007, 4756 : 115 - +