Learned index for non-key queries

Cited by: 0
Authors
Zhu, Rui [1]
Wang, Hongzhi [1]
Xia, Sheng [1]
Zheng, Bo [2]
Affiliations
[1] Harbin Inst Technol, Comp Sci & Technol, 92 Xidazhi St, Harbin 150000, Heilongjiang, Peoples R China
[2] CnosDB, Beijing 100000, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Bloom filter; Learned index; Non-key query; Index
DOI
10.1007/s10115-024-02233-0
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Learned indexes have attracted considerable interest recently because they can outperform conventional indexes. At large data volumes, a learned index also mitigates the large memory footprint of a conventional index. In this paper, we concentrate on a well-known learned index, the recursive model index (RMI). Because the machine learning model treats every query alike, when many of the queried values are non-keys the model still computes a position for each of them as if it were an existing key, which wastes time on unnecessary calculations. To address this, we propose HBFdex, a hierarchical learned index structure based on Bloom filters (BF). HBFdex prunes non-keys effectively: most non-key queries return at a BF layer before they reach the machine learning model. By reducing both the number of layers a non-key traverses and the time spent searching for a non-key within the error bound provided by the machine learning model, HBFdex decreases the average query time of the learned index. We compare HBFdex with B-Tree and RMI, and the results show that the new structure improves the performance of RMI under non-key queries.
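The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' HBFdex: it places one Bloom filter in front of one linear model rather than a hierarchy over an RMI, and the names `BloomFilter`, `FilteredLearnedIndex`, `bits_per_key`, `num_hashes`, and `model_calls` are hypothetical, chosen only for illustration.

```python
import bisect
import hashlib


class BloomFilter:
    """Plain Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: int):
        # Derive the bit positions by double hashing over one SHA-256 digest.
        digest = hashlib.sha256(str(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: int) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: int) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class FilteredLearnedIndex:
    """Illustrative only: a single linear model guarded by a Bloom filter."""

    def __init__(self, keys, bits_per_key: int = 10, num_hashes: int = 7):
        self.keys = sorted(set(keys))
        n = len(self.keys)
        if n == 0:
            raise ValueError("at least one key is required")
        self.filter = BloomFilter(max(8, n * bits_per_key), num_hashes)
        for k in self.keys:
            self.filter.add(k)
        # Least-squares fit of: position ~ slope * key + intercept.
        mean_k = sum(self.keys) / n
        mean_p = (n - 1) / 2
        var = sum((k - mean_k) ** 2 for k in self.keys)
        cov = sum((k - mean_k) * (i - mean_p) for i, k in enumerate(self.keys))
        self.slope = cov / var if var else 0.0
        self.intercept = mean_p - self.slope * mean_k
        # The largest prediction error over stored keys bounds the search window.
        self.max_err = max(abs(self._predict(k) - i) for i, k in enumerate(self.keys))
        self.model_calls = 0  # counts queries that reach the model

    def _predict(self, key: int) -> int:
        return round(self.slope * key + self.intercept)

    def lookup(self, key: int):
        # Most non-keys stop here and never reach the model.
        if not self.filter.might_contain(key):
            return None
        self.model_calls += 1
        n = len(self.keys)
        guess = self._predict(key)
        lo = min(max(0, guess - self.max_err), n)
        hi = max(lo, min(n, guess + self.max_err + 1))
        # Binary search restricted to the model's error window.
        i = bisect.bisect_left(self.keys, key, lo, hi)
        if i < hi and self.keys[i] == key:
            return i
        return None
```

In this sketch, a query such as `FilteredLearnedIndex(range(0, 3000, 3)).lookup(301)` returns `None` either at the filter or, on a Bloom-filter false positive, after the bounded search. The `model_calls` counter shows how many queries actually reached the model, which is the cost that filtering is meant to reduce.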
Pages: 497-519
Page count: 23