A lower bound for dynamic approximate membership data structures

Cited by: 16
Authors
Lovett, Shachar [1]
Porat, Ely [2]
Affiliations
[1] Weizmann Inst Sci, Dept Comp Sci, IL-76100 Rehovot, Israel
[2] Bar Ilan Univ, Ramat Gan, Israel
Funding
Israel Science Foundation; European Research Council
Keywords
Dynamic data structures; Bloom filters; Lower bounds;
DOI
10.1109/FOCS.2010.81
Chinese Library Classification number
TP31 [Computer software]
Discipline classification codes
081202; 0835
Abstract
An approximate membership data structure is a randomized data structure for representing a set which supports membership queries. It allows for a small false positive error rate but has no false negative errors. Such data structures were first introduced by Bloom in the 1970s, and have since had numerous applications, mainly in distributed systems, database systems, and networks. The algorithm of Bloom is quite effective: it can store a set $S$ of size $n$ using only $\approx 1.44\, n \log_2(1/\epsilon)$ bits while having false positive error $\epsilon$. This is within a constant factor of the entropy lower bound of $n \log_2(1/\epsilon)$ for storing such sets. Closing this gap is an important open problem, as Bloom filters are widely used in situations where storage is at a premium. Bloom filters have another property: they are dynamic. That is, they support the iterative insertion of up to $n$ elements. In fact, if one removes this requirement, there exist static data structures which receive the entire set at once and can almost achieve the entropy lower bound; they require only $n \log_2(1/\epsilon)\,(1 + o(1))$ bits. Our main result is a new lower bound for the memory requirements of any dynamic approximate membership data structure. We show that for any constant $\epsilon > 0$, any such data structure which achieves a false positive error rate of $\epsilon$ must use at least $C(\epsilon) \cdot n \log_2(1/\epsilon)$ memory bits, where $C(\epsilon) > 1$ depends only on $\epsilon$. This shows that the entropy lower bound cannot be achieved by dynamic data structures for any constant error rate. In fact, our lower bound holds even in the setting where the insertion and query algorithms may use shared randomness, and where they are only required to perform well on average.
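The gap the abstract describes between a Bloom filter and the entropy lower bound can be illustrated with a short Python sketch. This is not the paper's construction. It is a minimal textbook-style Bloom filter under stated assumptions: the names `entropy_lower_bound_bits`, `bloom_filter_bits`, and `BloomFilter` are hypothetical, the constant 1.44 is taken as $\log_2 e$, and the SHA-256 double-hashing scheme is just one convenient way to derive bit positions.

```python
import hashlib
import math


def entropy_lower_bound_bits(n: int, eps: float) -> float:
    """Entropy lower bound from the abstract: n * log2(1/eps) bits."""
    return n * math.log2(1 / eps)


def bloom_filter_bits(n: int, eps: float) -> float:
    """Standard Bloom filter sizing: log2(e) * n * log2(1/eps) bits,
    where log2(e) is roughly the 1.44 factor quoted in the abstract."""
    return math.log2(math.e) * n * math.log2(1 / eps)


class BloomFilter:
    """Illustrative dynamic approximate membership structure (assumed design)."""

    def __init__(self, n: int, eps: float):
        self.m = max(1, math.ceil(bloom_filter_bits(n, eps)))  # number of bits
        self.k = max(1, round(math.log2(1 / eps)))             # number of hash probes
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: str):
        # Assumption: derive k positions by double hashing a SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def insert(self, item: str) -> None:
        # Dynamic insertion: set the k bits for this item.
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def query(self, item: str) -> bool:
        # True for every inserted item (no false negatives);
        # may also be True for a non-member (false positive).
        return all((self.bits[p // 8] >> (p % 8)) & 1 for p in self._positions(item))


if __name__ == "__main__":
    n, eps = 1000, 0.01
    print("entropy bound (bits):", entropy_lower_bound_bits(n, eps))
    print("Bloom filter  (bits):", bloom_filter_bits(n, eps))
```

The ratio of the two sizes is $\log_2 e \approx 1.44$ regardless of $n$ and $\epsilon$, which is the constant-factor gap the abstract refers to. The paper's result says that some gap $C(\epsilon) > 1$ is unavoidable for any dynamic structure, not only for this one.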
Pages: 797-804
Page count: 8