Histograms based on the minimum description length principle

Cited by: 0
Authors
Hai Wang
Kenneth C. Sevcik
Affiliations
[1] Saint Mary’s University, Sobey School of Business
[2] University of Toronto, Department of Computer Science
Source
The VLDB Journal | 2008 / Vol. 17
Keywords
Query processing; Approximate query answering; Data summarization; Histograms
DOI
Not available
Abstract
Histograms have been widely used for selectivity estimation in query optimization, as well as for fast approximate query answering in many OLAP, data mining, and data visualization applications. This paper presents a new family of histograms, the Hierarchical Model Fitting (HMF) histograms, based on the Minimum Description Length principle. Rather than describing every bucket of a histogram with the same type of model, the HMF histograms employ a locally optimal model for each bucket. The improved effectiveness of the locally chosen models more than offsets the overhead of keeping track of the representation of each individual bucket. Through a set of experiments, we show that the HMF histograms are capable of providing more accurate approximations than previously proposed techniques for many real and synthetic data sets across a variety of query workloads.
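The trade-off the abstract describes, where a richer per-bucket model must pay for its own extra storage, can be illustrated with a generic two-part description length. The sketch below is not the HMF algorithm from the paper. It assumes only two candidate models (a uniform and a linear fit), a fixed cost of `PARAM_BITS` per stored parameter, and a Gaussian code length for the residuals. All names and constants are hypothetical, chosen for illustration.

```python
import math

PARAM_BITS = 32        # assumed cost, in bits, of storing one model parameter
VAR_FLOOR = 1.0 / 12   # assumed variance floor (rounding to integers); avoids log(0)

def fit_uniform(freqs):
    """One-parameter model: every value in the bucket has the mean frequency."""
    mean = sum(freqs) / len(freqs)
    return [mean] * len(freqs), 1

def fit_linear(freqs):
    """Two-parameter model: least-squares line over positions 0..n-1."""
    n = len(freqs)
    x_mean = (n - 1) / 2
    y_mean = sum(freqs) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(freqs))
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = y_mean - slope * x_mean
    return [intercept + slope * x for x in range(n)], 2

def description_length(freqs, predicted, n_params):
    """Two-part cost: bits for the parameters plus bits for the residuals."""
    n = len(freqs)
    rss = sum((y - p) ** 2 for y, p in zip(freqs, predicted))
    variance = max(rss / n, VAR_FLOOR)
    data_bits = 0.5 * n * math.log2(2 * math.pi * math.e * variance)
    return n_params * PARAM_BITS + data_bits

def choose_model(freqs):
    """Return the cheapest candidate model for one non-empty bucket, with all costs."""
    candidates = {"uniform": fit_uniform, "linear": fit_linear}
    costs = {}
    for name, fit in candidates.items():
        predicted, n_params = fit(freqs)
        costs[name] = description_length(freqs, predicted, n_params)
    best = min(costs, key=costs.get)
    return best, costs

flat_choice, _ = choose_model([5, 5, 5, 5, 5, 5, 5, 5])
sloped_choice, _ = choose_model([10, 20, 30, 40, 50, 60, 70, 80])
```

Under these assumptions, a flat bucket keeps the cheaper uniform model, because the second parameter buys no reduction in residual cost. A steadily increasing bucket selects the linear model, because the saving on residual bits exceeds the extra `PARAM_BITS`.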
Pages: 419-442
Page count: 23
Related papers
50 in total
  • [1] Histograms based on the minimum description length principle
    Wang, Hai
    Sevcik, Kenneth C.
    [J]. VLDB JOURNAL, 2008, 17 (03) : 419 - 442
  • [2] The minimum description length principle for probability density estimation by regular histograms
    Chapeau-Blondeau, Francois
    Rousseau, David
    [J]. PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS, 2009, 388 (18) : 3969 - 3984
  • [3] Introducing the minimum description length principle
    Grünwald, P
    [J]. ADVANCES IN MINIMUM DESCRIPTION LENGTH THEORY AND APPLICATIONS, 2005, : 3 - 21
  • [4] A minimum description length principle for perception
    Chater, N
    [J]. ADVANCES IN MINIMUM DESCRIPTION LENGTH THEORY AND APPLICATIONS, 2005, : 385 - 409
  • [5] Cluster Validity Measures Based on the Minimum Description Length Principle
    Georgieva, Olga
    Tschumitschew, Katharina
    Klawonn, Frank
    [J]. KNOWLEDGE-BASED AND INTELLIGENT INFORMATION AND ENGINEERING SYSTEMS, PT I: 15TH INTERNATIONAL CONFERENCE, KES 2011, 2011, 6881 : 82 - 89
  • [6] The minimum description length principle in coding and modeling
    Barron, A
    Rissanen, J
    Yu, B
    [J]. IEEE TRANSACTIONS ON INFORMATION THEORY, 1998, 44 (06) : 2743 - 2760
  • [7] Incremental Learning with the Minimum Description Length Principle
    Murena, Pierre-Alexandre
    Cornuejols, Antoine
    Dessalles, Jean-Louis
    [J]. 2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 1908 - 1915
  • [8] Model selection and the principle of minimum description length
    Hansen, MH
    Yu, B
    [J]. JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (454) : 746 - 774
  • [9] A first look at the minimum description length principle
    Grunwald, Peter D.
    [J]. INTELLIGENT ALGORITHMS IN AMBIENT AND BIOMEDICAL COMPUTING, 2006, 7 : 187 - 213
  • [10] Clustgrams: an extension to histogram densities based on the minimum description length principle
    Luosto, Panu
    Kontkanen, Petri
    [J]. OPEN COMPUTER SCIENCE, 2011, 1 (04) : 466 - 481