Two-layer partitioned and deletable deep bloom filter for large-scale membership query

Authors
Zeng, Meng [1]
Zou, Beiji [1]
Zhang, Wensheng [2]
Yang, Xuebing [2]
Kong, Guilan [3, 4]
Kui, Xiaoyan [1]
Zhu, Chengzhang [5]
Affiliations
[1] Cent South Univ, Sch Comp Sci & Engn, Changsha, Peoples R China
[2] Chinese Acad Sci, Inst Automat, Beijing, Peoples R China
[3] Peking Univ, Natl Inst Hlth Data Sci, Beijing, Peoples R China
[4] Peking Univ, Adv Inst Informat Technol, Hangzhou, Peoples R China
[5] Cent South Univ, Coll Literature & Journalism, Changsha, Peoples R China
Funding
National Key Research and Development Program of China; National Natural Science Foundation of China;
Keywords
Learned bloom filter; Membership query; Deep learning; K-means cluster; Perfect hash function;
DOI
10.1016/j.is.2023.102267
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The recently proposed Learned Bloom Filter (LBF) provides a new perspective on large-scale membership queries by using machine learning to replace the traditional bloom filter. However, reducing the false positive rate (FPR) of the learned model with small memory usage and supporting deletion efficiently have become new issues. In this paper, we propose a novel Two-layer Partitioned and Deletable Deep Bloom Filter (PDDBF) for large-scale membership query, which can reduce the FPR with small memory usage and support deletion efficiently. The proposed PDDBF consists of three main parts: (1) Data partition. To improve the classification accuracy of the learned model, the K-means cluster with the elbow method is used for the data partition. (2) Deep Bloom Filter. To reduce the FPR, deep learning models are used to construct multiple independent learning mechanisms, which correspond to the clusters obtained in part (1). (3) Partitioned backup filter. To support deletion under the premise of ensuring low FPR and reducing query time consumption, the perfect hash (PH) table and counting bloom filters (CBFs) are combined on the basis of the partition bloom filter. Experiments show that the proposed PDDBF reduces the FPR by 87.13% with the same memory usage compared with the state-of-the-art PLBF on a real-world URLs data set. Moreover, the PDDBF reduces the FPR by 99.68% with the same memory usage and reduces the query time consumption to 2.61x that of the PLBF after data deletion. © 2023 Elsevier Ltd. All rights reserved.
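Illustrative note: the abstract states that deletion is supported through counting bloom filters (CBFs) in the backup layer. The sketch below is a generic, minimal counting bloom filter in Python, not the authors' implementation. The class name, the parameters `num_counters` and `num_hashes`, and the SHA-256 double-hashing scheme are all assumptions made for illustration; the paper's partitioning, deep models, and perfect hash table are not reproduced here.

```python
import hashlib


class CountingBloomFilter:
    """Generic counting bloom filter (illustrative; names are hypothetical)."""

    def __init__(self, num_counters: int = 1024, num_hashes: int = 4):
        self.m = num_counters
        self.k = num_hashes
        # One small integer counter per slot instead of a single bit.
        self.counters = [0] * num_counters

    def _indexes(self, item: str):
        # Assumption: derive k indexes from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for idx in self._indexes(item):
            self.counters[idx] += 1

    def __contains__(self, item: str) -> bool:
        # May return a false positive, never a false negative for added items.
        return all(self.counters[idx] > 0 for idx in self._indexes(item))

    def remove(self, item: str) -> bool:
        # Only decrement when the item appears present, so counters stay >= 0.
        if item not in self:
            return False
        for idx in self._indexes(item):
            self.counters[idx] -= 1
        return True


cbf = CountingBloomFilter(num_counters=256, num_hashes=3)
cbf.add("http://example.com/a")
present_before = "http://example.com/a" in cbf
removed = cbf.remove("http://example.com/a")
present_after = "http://example.com/a" in cbf
```

Using counters rather than bits is what makes deletion possible: removing a key decrements its slots without clearing slots still needed by other keys, at the cost of more memory per slot.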
Pages: 13