A "Learned" Approach to Quicken and Compress Rank/Select Dictionaries

Cited by: 0
Authors
Boffa, Antonio [1]
Ferragina, Paolo [1]
Vinciguerra, Giorgio [1]
Affiliations
[1] Univ Pisa, Dipartimento Informat, Pisa, Italy
Keywords
RANK;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, methods];
Discipline classification code
081202;
Abstract
We address the well-known problem of designing, implementing, and experimenting with compressed data structures for supporting rank and select queries over a dictionary of integers. This problem has been studied far and wide since the end of the '80s with tons of important theoretical and practical results. Following a recent line of research on the so-called learned data structures, we first show that this problem has a surprising connection with the geometry of a set of points in the Cartesian plane suitably derived from the input integers. We then build upon some classical results in computational geometry to introduce the first "learned" scheme for implementing a compressed rank/select dictionary. We prove theoretical bounds on its time and space performance both in the worst case and in the case of input distributions with finite mean and variance. We corroborate these theoretical results with a large set of experiments over datasets originating from a variety of sources and applications (Web, DNA sequencing, information retrieval and natural language processing), and we show that a carefully engineered version of our approach provides new interesting space-time trade-offs with respect to several well-established implementations of Elias-Fano, RRR-vector, and random-access vectors of Elias gamma/delta-coded gaps.
Pages: 46-59
Number of pages: 14
Related papers
50 in total
  • [1] A Learned Approach to Design Compressed Rank/Select Data Structures
    Boffa, Antonio
    Ferragina, Paolo
    Vinciguerra, Giorgio
    ACM TRANSACTIONS ON ALGORITHMS, 2022, 18 (03)
  • [2] Faster Practical Block Compression for Rank/Select Dictionaries
    Kaneta, Yusaku
    STRING PROCESSING AND INFORMATION RETRIEVAL (SPIRE 2017), 2017, 10508 : 234 - 240
  • [3] Rank and select: Another lesson learned
    Grabowski, Szymon
    Raniszewski, Marcin
    INFORMATION SYSTEMS, 2018, 73 : 25 - 34
  • [4] Techniques to encode and compress fault dictionaries
    Chakravarty, S
    Gopal, V
    17TH IEEE VLSI TEST SYMPOSIUM, PROCEEDINGS, 1999 : 195 - 200
  • [5] Static dictionaries supporting rank
    Raman, V
    Rao, SS
    ALGORITHMS AND COMPUTATIONS, 2000, 1741 : 18 - 26
  • [6] ULTRASOUND TOMOGRAPHY WITH LEARNED DICTIONARIES
    Tosic, Ivana
    Jovanovic, Ivana
    Frossard, Pascal
    Vetterli, Martin
    Duric, Neb
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010 : 5502 - 5505
  • [7] Singing Voice Separation by Low-Rank and Sparse Spectrogram Decomposition with Pre-learned Dictionaries
    Yu, Shiwei
    Zhang, Hongjuan
    Duan, Zhiyao
    JOURNAL OF THE AUDIO ENGINEERING SOCIETY, 2017, 65 (05) : 377 - 388
  • [8] A deductive approach to select or rank journals in multifaceted subject, Oceanography
    Sahu, Satya Ranjan
    Panda, Krushna Chandra
    SCIENTOMETRICS, 2012, 92 (03) : 609 - 619
  • [9] A deductive approach to select or rank journals in multifaceted subject, Oceanography
    Satya Ranjan Sahu
    Krushna Chandra Panda
    Scientometrics, 2012, 92 : 609 - 619
  • [10] Sparsity-Based Approach for Ocean Acoustic Tomography Using Learned Dictionaries
    Wang, Tongchen
    Xu, Wen
    OCEANS 2016 - SHANGHAI, 2016