Locality-Sensitive Bloom Filter for Approximate Membership Query

Cited by: 48
Authors
Hua, Yu [1]
Xiao, Bin [2]
Veeravalli, Bharadwaj [3]
Feng, Dan [1]
Affiliations
[1] Huazhong Univ Sci & Technol, Wuhan Natl Lab Optoelect, Sch Comp Sci & Technol, Wuhan 430074, Peoples R China
[2] Hong Kong Polytech Univ, Dept Comp, Kowloon, Hong Kong, Peoples R China
[3] Natl Univ Singapore, Dept Elect & Comp Engn, Singapore 117576, Singapore
Funding
National Natural Science Foundation of China
Keywords
Approximate membership query; Bloom filters; locality-sensitive hashing
DOI
10.1109/TC.2011.108
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In many network applications, Bloom filters are used to support exact-matching membership queries because they are randomized, space-efficient data structures with a small probability of false answers. In this paper, we extend the standard Bloom filter to a Locality-Sensitive Bloom Filter (LSBF) that provides an Approximate Membership Query (AMQ) service. We achieve this by replacing the uniform and independent hash functions with locality-sensitive hash functions. This replacement makes the storage in LSBF locality sensitive, while the Bloom filter design keeps LSBF space efficient and responsive to queries. In the design of the LSBF structure, we propose a bit vector to reduce False Positives (FP). The bit vector can verify multiple attributes belonging to one member. We also use an active overflowed scheme to significantly decrease False Negatives (FN). Rigorous theoretical analysis (e.g., of FP, FN, and space overhead) shows that the design of LSBF is space compact and can provide accurate responses to approximate membership queries. We have implemented LSBF in a real distributed system and performed extensive experiments using real-world traces. Experimental results show that LSBF, compared with a baseline approach and other state-of-the-art work in the literature (SmartStore and LSB-tree), takes less time to respond to AMQs and consumes much less storage space.
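
The core idea in the abstract, swapping a Bloom filter's uniform hash functions for locality-sensitive ones, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The class name `SimpleLSBF`, the parameters `m`, `k`, and `w`, and the choice of a p-stable (Gaussian projection) hash family are all assumptions made for illustration, and the sketch omits the verification bit vector and the active overflowed scheme that the paper uses to reduce FP and FN.

```python
import math
import random


class SimpleLSBF:
    """Toy locality-sensitive Bloom filter (illustrative assumption, not the paper's design)."""

    def __init__(self, dim, m=1024, k=4, w=4.0, seed=0):
        rng = random.Random(seed)
        self.m = m                    # number of bits in the filter
        self.w = w                    # bucket width of each LSH function
        self.bits = [False] * m
        # Each of the k hash functions is a random Gaussian projection a
        # plus a random offset b drawn from [0, w).
        self.funcs = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, w))
            for _ in range(k)
        ]

    def _positions(self, v):
        # h(v) = floor((a . v + b) / w), folded into the bit array by modulo.
        for a, b in self.funcs:
            proj = sum(ai * vi for ai, vi in zip(a, v))
            bucket = math.floor((proj + b) / self.w)
            yield bucket % self.m

    def add(self, v):
        for pos in self._positions(v):
            self.bits[pos] = True

    def query(self, v):
        # Reports "approximately a member" only if every hashed bit is set.
        return all(self.bits[pos] for pos in self._positions(v))


lsbf = SimpleLSBF(dim=3)
lsbf.add([1.0, 2.0, 3.0])
print(lsbf.query([1.0, 2.0, 3.0]))  # an inserted point always hits its own bits
```

Because nearby vectors tend to fall into the same buckets, a query for a point close to a stored item is likely, though not guaranteed, to find all of its bits set. A near neighbor that lands just across a bucket boundary is missed (a false negative), and unrelated items can jointly set all of a query's bits (a false positive); these are the two error sources the abstract says the overflowed scheme and the verification bit vector address.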
Pages: 817-830
Page count: 14