Two-layer partitioned and deletable deep bloom filter for large-scale membership query

Cited by: 0
Authors
Zeng, Meng [1]
Zou, Beiji [1]
Zhang, Wensheng [2]
Yang, Xuebing [2]
Kong, Guilan [3, 4]
Kui, Xiaoyan [1]
Zhu, Chengzhang [5]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[3] Peking Univ, Natl Inst Hlth Data Sci, Beijing, Peoples R China
[4] Peking Univ, Adv Inst Informat Technol, Hangzhou, Peoples R China
[5] Cent South Univ, Coll Literature & Journalism, Changsha, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Learned bloom filter; Membership query; Deep learning; K-means cluster; Perfect hash function;
DOI
10.1016/j.is.2023.102267
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The recently proposed Learned Bloom Filter (LBF) offers a new perspective on large-scale membership queries by using machine learning to replace the traditional bloom filter. However, two new issues arise: reducing the false positive rate (FPR) of the learned model while keeping memory usage small, and supporting deletion efficiently. In this paper, we propose a novel Two-layer Partitioned and Deletable Deep Bloom Filter (PDDBF) for large-scale membership query, which reduces the FPR with small memory usage and supports deletion efficiently. The proposed PDDBF consists of three main parts: (1) Data partition. To improve the classification accuracy of the learned model, K-means clustering with the elbow method is used to partition the data. (2) Deep Bloom Filter. To reduce the FPR, deep learning models are used to construct multiple independent learning mechanisms, one for each cluster obtained in part (1). (3) Partitioned backup filter. To support deletion while keeping the FPR low and reducing query time, a perfect hash (PH) table and counting bloom filters (CBFs) are combined on the basis of the partitioned bloom filter. Experiments show that the proposed PDDBF reduces the FPR by 87.13% with the same memory usage compared with the state-of-the-art PLBF on a real-world URL data set. Moreover, after data deletion, the PDDBF reduces the FPR by 99.68% with the same memory usage and reduces the query time consumption to 2.61x that of the PLBF. © 2023 Elsevier Ltd. All rights reserved.
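The deletion support described in part (3) rests on counting bloom filters, which replace each bit with a small counter so that a key can be removed by decrementing. The following is a minimal Python sketch of that general idea, not the paper's implementation: the class names `CountingBloomFilter` and `PartitionedBackupFilter`, the salted SHA-256 hashing, the default sizes, and the `assign_partition` callback (a stand-in for the K-means cluster assignment) are all assumptions made for illustration, and the perfect hash table and the deep learning models are omitted.

```python
import hashlib


class CountingBloomFilter:
    """Bloom filter variant that keeps counters instead of bits, so keys can be deleted."""

    def __init__(self, num_counters: int = 1024, num_hashes: int = 4):
        self.num_counters = num_counters
        self.num_hashes = num_hashes
        self.counters = [0] * num_counters

    def _positions(self, key: str):
        # Derive num_hashes positions by salting SHA-256 (an illustrative choice).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_counters

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.counters[pos] += 1

    def contains(self, key: str) -> bool:
        # May return a false positive, never a false negative for inserted keys.
        return all(self.counters[pos] > 0 for pos in self._positions(key))

    def remove(self, key: str) -> bool:
        # Only decrement when the key appears present, to avoid negative counters.
        if not self.contains(key):
            return False
        for pos in self._positions(key):
            self.counters[pos] -= 1
        return True


class PartitionedBackupFilter:
    """One counting bloom filter per partition; assign_partition is a hypothetical
    stand-in for the cluster assignment produced by the data-partition step."""

    def __init__(self, num_partitions: int, assign_partition):
        self.assign_partition = assign_partition
        self.filters = [CountingBloomFilter() for _ in range(num_partitions)]

    def add(self, key: str) -> None:
        self.filters[self.assign_partition(key)].add(key)

    def contains(self, key: str) -> bool:
        return self.filters[self.assign_partition(key)].contains(key)

    def remove(self, key: str) -> bool:
        return self.filters[self.assign_partition(key)].remove(key)


if __name__ == "__main__":
    # Toy partition rule (assumption): route keys by length modulo 3.
    backup = PartitionedBackupFilter(3, lambda key: len(key) % 3)
    backup.add("http://example.com/a")
    print(backup.contains("http://example.com/a"))
    backup.remove("http://example.com/a")
    print(backup.contains("http://example.com/a"))
```

Routing each key to a single per-partition filter means a query or deletion touches only one small filter, which is one plausible reading of why a partitioned backup can keep query time low.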
Pages: 13