Pseudo-random number generation for sketch-based estimations

Cited by: 12
Authors
Rusu, Florin [1]
Dobra, Alin [1]
Affiliation
[1] Univ Florida, Dept Comp & Informat Sci & Engn, Gainesville, FL 32611 USA
Source
ACM TRANSACTIONS ON DATABASE SYSTEMS | 2007 / Vol. 32 / Issue 02
Keywords
algorithms; experimentation; performance; theory; sketches; data synopses; approximate query processing; fast range-summation
DOI
10.1145/1242524.1242528
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
The exact computation of aggregate queries, like the size of join of two relations, usually requires large amounts of memory (constrained in data-streaming) or communication (constrained in distributed computation) and large processing times. In this situation, approximation techniques with provable guarantees, like sketches, are one possible solution. The performance of sketches depends crucially on the ability to generate particular pseudo-random numbers. In this article we investigate both theoretically and empirically the problem of generating k-wise independent pseudo-random numbers and, in particular, that of generating 3- and 4-wise independent pseudo-random numbers that are fast range-summable (i.e., they can be summed in sublinear time). Our specific contributions are: (a) we provide a thorough comparison of the various pseudo-random number generating schemes; (b) we study both theoretically and empirically the fast range-summation property of 3- and 4-wise independent generating schemes; (c) we provide algorithms for the fast range-summation of two 3-wise independent schemes, BCH and extended Hamming; and (d) we show convincing theoretical and empirical evidence that the extended Hamming scheme performs as well as any 4-wise independent scheme for estimating the size of join of two relations using AMS sketches, even though it is only 3-wise independent. We use this scheme to generate estimators that significantly outperform state-of-the-art solutions for two problems, namely, size of spatial joins and selectivity estimation.
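The fast range-summation property mentioned in the abstract can be illustrated with a minimal sketch. It is not the article's own algorithm; it assumes the common form of the 3-wise independent BCH scheme, in which the value for index i is (-1) raised to the parity of a seed bit XOR the bitwise inner product of a seed word with i. Under that assumption, the sum over an aligned dyadic interval is either zero or a power of two times a single value, so an arbitrary range can be summed with a logarithmic number of such intervals. All function names below are hypothetical.

```python
import random


def parity(x: int) -> int:
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") & 1


def bch3(seed0: int, seed: int, i: int) -> int:
    """Assumed BCH3 form: +1 or -1 from (-1)^(seed0 XOR <seed, i>) over GF(2)."""
    return 1 - 2 * (seed0 ^ parity(seed & i))


def bch3_dyadic_sum(seed0: int, seed: int, q: int, j: int) -> int:
    """Sum of bch3 over the dyadic interval [q * 2^j, (q + 1) * 2^j).

    If any of the low j seed bits is set, the +1 and -1 values cancel
    exactly; otherwise every value in the interval equals the first one.
    """
    if seed & ((1 << j) - 1):
        return 0
    return (1 << j) * bch3(seed0, seed, q << j)


def bch3_range_sum(seed0: int, seed: int, lo: int, hi: int) -> int:
    """Sum of bch3 over [lo, hi) using O(log(hi - lo)) dyadic intervals."""
    total = 0
    while lo < hi:
        # Largest block size that keeps lo aligned (any size when lo == 0).
        j = (lo & -lo).bit_length() - 1 if lo else hi.bit_length()
        # Shrink the block until it fits inside the remaining range.
        while (1 << j) > hi - lo:
            j -= 1
        total += bch3_dyadic_sum(seed0, seed, lo >> j, j)
        lo += 1 << j
    return total


if __name__ == "__main__":
    rng = random.Random(2007)  # arbitrary seed for the demonstration
    s0, s = rng.randrange(2), rng.randrange(1 << 16)
    lo, hi = 1000, 50000
    fast = bch3_range_sum(s0, s, lo, hi)
    slow = sum(bch3(s0, s, i) for i in range(lo, hi))
    print(fast, slow, fast == slow)
```

The brute-force sum visits every index in the range, while the dyadic version touches only a handful of intervals, which is the sublinear behavior the abstract refers to. The extended Hamming scheme adds a nonlinear term to the exponent and needs a different summation argument, which this sketch does not attempt.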
Pages: 48
Related papers
50 in total
  • [1] PSEUDO-RANDOM NUMBER GENERATION AND SPACE COMPLEXITY
    FURST, M
    LIPTON, R
    STOCKMEYER, L
    LECTURE NOTES IN COMPUTER SCIENCE, 1983, 158 : 171 - 176
  • [2] Efficient parallel pseudo-random number generation
    Tan, CJK
    PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED PROCESSING TECHNIQUES AND APPLICATIONS, VOLS I-V, 2000, : 309 - 314
  • [3] Pseudo-random number generation using LSTMs
    Young-Seob Jeong
    Kyo-Joong Oh
    Chung-Ki Cho
    Ho-Jin Choi
    The Journal of Supercomputing, 2020, 76 : 8324 - 8342
  • [4] Pseudo-random number generation using LSTMs
    Jeong, Young-Seob
    Oh, Kyo-Joong
    Cho, Chung-Ki
    Choi, Ho-Jin
    JOURNAL OF SUPERCOMPUTING, 2020, 76 (10): : 8324 - 8342
  • [5] Pseudo-Random Number Generation on GP-GPU
    Passerat-Palmbach, Jonathan
    Mazel, Claude
    Hill, David R. C.
    2011 IEEE WORKSHOP ON PRINCIPLES OF ADVANCED AND DISTRIBUTED SIMULATION (PADS), 2011,
  • [6] Evaluation of Pseudo-Random Number Generation on GPU Cards
    Askar, Tair
    Shukirgaliyev, Bekdaulet
    Lukac, Martin
    Abdikamalov, Ernazar
    COMPUTATION, 2021, 9 (12)
  • [7] SOME NEW RESULTS IN PSEUDO-RANDOM NUMBER GENERATION
    VANGELDER, A
    JOURNAL OF THE ACM, 1967, 14 (04) : 785 - &
  • [8] Exploring quantum systems for pseudo-random number generation
    Luis José Mantilla Santa Cruz
    Luis Fernando Faina
    João Henrique de Souza Pereira
    Quantum Studies: Mathematics and Foundations, 2025, 12 (1)
  • [9] A pseudo-random number generator based on LZSS
    Chang, Weiling
    Fang, Binxing
    Yun, Xiaochun
    Wang, Shupeng
    Yu, Xiangzhan
    2010 DATA COMPRESSION CONFERENCE (DCC 2010), 2010, : 524 - 524
  • [10] Pseudo-random number generation based on digit isolation referenced to entropy buffers
    Richardson, Joseph D.
    SIMULATION-TRANSACTIONS OF THE SOCIETY FOR MODELING AND SIMULATION INTERNATIONAL, 2022, 98 (05): : 389 - 406