A "Learned" Approach to Quicken and Compress Rank/Select Dictionaries

Cited by: 0
Authors
Boffa, Antonio [1]
Ferragina, Paolo [1]
Vinciguerra, Giorgio [1]
Affiliations
[1] Univ Pisa, Dipartimento Informat, Pisa, Italy
Keywords
RANK;
DOI
Not available
Chinese Library Classification
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
We address the well-known problem of designing, implementing and experimenting with compressed data structures for supporting rank and select queries over a dictionary of integers. This problem has been studied extensively since the end of the '80s, yielding many important theoretical and practical results. Following a recent line of research on the so-called learned data structures, we first show that this problem has a surprising connection with the geometry of a set of points in the Cartesian plane suitably derived from the input integers. We then build upon some classical results in computational geometry to introduce the first "learned" scheme for implementing a compressed rank/select dictionary. We prove theoretical bounds on its time and space performance both in the worst case and in the case of input distributions with finite mean and variance. We corroborate these theoretical results with a large set of experiments over datasets originating from a variety of sources and applications (Web, DNA sequencing, information retrieval and natural language processing), and we show that a carefully engineered version of our approach provides new interesting space-time trade-offs with respect to several well-established implementations of Elias-Fano, RRR-vector, and random-access vectors of Elias γ/δ-coded gaps.
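To make the rank/select operations and the geometric view concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the scheme from the paper: it assumes the integers are mapped to the points (i, x_i), fits a single straight line through the first and last point, and stores one integer correction per key so that select stays exact. The names `build`, `select`, and `rank` are hypothetical, and no compression of the corrections is attempted.

```python
def build(keys):
    """Fit one line through (1, keys[0]) and (n, keys[-1]) and keep corrections.

    Assumption for illustration: keys is a non-empty, sorted list of distinct
    integers, viewed as the points (i, keys[i-1]) for i = 1..n.
    """
    n = len(keys)
    slope = (keys[-1] - keys[0]) / (n - 1) if n > 1 else 0.0
    intercept = keys[0] - slope  # makes the line pass through (1, keys[0])
    # Correction = true key minus rounded prediction, so select is exact.
    corrections = [k - round(slope * i + intercept)
                   for i, k in enumerate(keys, start=1)]
    return slope, intercept, corrections


def select(model, i):
    """Return the i-th smallest key (1-based): prediction plus correction."""
    slope, intercept, corrections = model
    return round(slope * i + intercept) + corrections[i - 1]


def rank(model, x):
    """Return the number of keys <= x via binary search over select."""
    lo, hi = 0, len(model[2])
    while lo < hi:
        mid = (lo + hi) // 2
        if select(model, mid + 1) <= x:
            lo = mid + 1
        else:
            hi = mid
    return lo


keys = [3, 7, 8, 15, 20, 21, 30]
model = build(keys)
print(select(model, 4))  # 4th smallest key
print(rank(model, 16))   # how many keys are <= 16
```

The closer the points lie to the line, the smaller the corrections and the fewer bits they would need; a scheme built on this idea would presumably use several segments rather than one.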
Pages: 46-59
Page count: 14