Optimal Approximate Sampling from Discrete Probability Distributions

Cited by: 3
Authors
Saad, Feras A. [1]
Freer, Cameron E. [2]
Rinard, Martin C. [1]
Mansinghka, Vikash K. [2]
Affiliations
[1] MIT, Dept Elect Engn & Comp Sci, Cambridge, MA 02139 USA
[2] MIT, Dept Brain & Cognit Sci, E25-618, Cambridge, MA 02139 USA
Keywords
random variate generation; discrete random variables; RANDOM NUMBERS; INFORMATION; STATISTICS; ALGORITHM;
DOI
10.1145/3371104
Chinese Library Classification (CLC) number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
This paper addresses a fundamental problem in random variate generation: given access to a random source that emits a stream of independent fair bits, what is the most accurate and entropy-efficient algorithm for sampling from a discrete probability distribution $(p_1, \ldots, p_n)$, where the probabilities of the output distribution $(\hat{p}_1, \ldots, \hat{p}_n)$ of the sampling algorithm must be specified using at most $k$ bits of precision? We present a theoretical framework for formulating this problem and provide new techniques for finding sampling algorithms that are optimal both statistically (in the sense of sampling accuracy) and information-theoretically (in the sense of entropy consumption). We leverage these results to build a system that, for a broad family of measures of statistical accuracy, delivers a sampling algorithm whose expected entropy usage is minimal among those that induce the same distribution (i.e., is "entropy-optimal") and whose output distribution $(\hat{p}_1, \ldots, \hat{p}_n)$ is a closest approximation to the target distribution $(p_1, \ldots, p_n)$ among all entropy-optimal sampling algorithms that operate within the specified $k$-bit precision. This optimal approximate sampler is also a closer approximation than any (possibly entropy-suboptimal) sampler that consumes a bounded amount of entropy with the specified precision, a class which includes floating-point implementations of inversion sampling and related methods found in many software libraries. We evaluate the accuracy, entropy consumption, precision requirements, and wall-clock runtime of our optimal approximate sampling algorithms on a broad set of distributions, demonstrating the ways that they are superior to existing approximate samplers and establishing that they often consume significantly fewer resources than are needed by exact samplers.
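To make the setting in the abstract concrete, the following is a minimal sketch of sampling from a distribution whose probabilities are already k-bit dyadic rationals, using only fair bits. It uses the classical Knuth-Yao bit-matrix walk, which is swapped in here for illustration and is not taken from the paper; it does not perform the paper's optimization step of choosing the closest k-bit approximation to a target distribution. The names `knuth_yao_sample`, `weights`, and `next_bit` are hypothetical.

```python
import random


def knuth_yao_sample(weights, k, next_bit):
    """Return index i with probability weights[i] / 2**k.

    Assumptions (illustrative, not from the paper):
      - weights are non-negative integers summing to exactly 2**k;
      - next_bit() returns an independent fair bit (0 or 1) per call.
    """
    total = 1 << k
    if sum(weights) != total:
        raise ValueError("weights must sum to 2**k")
    # Degenerate case: one outcome has probability 1, so no bits are needed.
    for i, w in enumerate(weights):
        if w == total:
            return i
    d = 0  # index of the current internal node within its tree level
    for col in range(k):
        d = 2 * d + next_bit()  # descend to one of the two children
        for i, w in enumerate(weights):
            # Bit number col (most significant first) of weights[i] / 2**k.
            d -= (w >> (k - 1 - col)) & 1
            if d < 0:
                return i  # the walk reached a leaf labelled i
    raise AssertionError("unreachable when weights sum to 2**k")


if __name__ == "__main__":
    # Target probabilities 3/8, 4/8, 1/8 with k = 3 bits of precision.
    draws = [knuth_yao_sample([3, 4, 1], 3, lambda: random.getrandbits(1))
             for _ in range(10)]
    print(draws)
```

Because the weights sum to exactly 2**k, the walk always terminates after at most k bits, and frequently after fewer, which is the sense in which such samplers economize on entropy.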
Pages: 31
Related papers
50 in total
  • [31] Efficient Sampling Methods for Discrete Distributions
    Bringmann, Karl
    Panagiotou, Konstantinos
    ALGORITHMICA, 2017, 79 (02) : 484 - 508
  • [32] SAMPLING THEOREM FOR FINITE DISCRETE DISTRIBUTIONS
    BROOKES, BC
    JOURNAL OF DOCUMENTATION, 1975, 31 (01) : 26 - 35
  • [33] RANDOM SAMPLING FROM JOINT PROBABILITY DISTRIBUTIONS DEFINED IN A BAYESIAN FRAMEWORK
    Mara, Thierry A.
    Fahs, Marwan
    Shao, Qian
    Younes, Anis
    SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2019, 41 (01) : A316 - A338
  • [34] Dynamic sampling from a discrete probability distribution with a known distribution of rates
    Federico D’Ambrosio
    Hans L. Bodlaender
    Gerard T. Barkema
    Computational Statistics, 2022, 37 : 1203 - 1228
  • [35] Dynamic sampling from a discrete probability distribution with a known distribution of rates
    D'Ambrosio, Federico
    Bodlaender, Hans L.
    Barkema, Gerard T.
    COMPUTATIONAL STATISTICS, 2022, 37 (03) : 1203 - 1228
  • [36] Sampling constrained continuous probability distributions: A review
    Lan, Shiwei
    Kang, Lulu
    WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL STATISTICS, 2023, 15 (06)
  • [37] The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions
    Saad, Feras A.
    Freer, Cameron E.
    Rinard, Martin C.
    Mansinghka, Vikash K.
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 108, 2020, 108 : 1036 - 1045
  • [38] COMPARISON OF EXACT AND APPROXIMATE REACTOR NOISE PROBABILITY DISTRIBUTIONS
    SZELESS, A
    RUBY, L
    TRANSACTIONS OF THE AMERICAN NUCLEAR SOCIETY, 1970, 13 (01) : 272 - &
  • [39] Massively Parallel Construction of Radix Tree Forests for the Efficient Sampling of Discrete or Piecewise Constant Probability Distributions
    Binder, Nikolaus
    Keller, Alexander
    MONTE CARLO AND QUASI-MONTE CARLO METHODS, MCQMC 2018, 2020, 324 : 143 - 159
  • [40] Discrete analogues of continuous bivariate probability distributions
    Barbiero, Alessandro
    ANNALS OF OPERATIONS RESEARCH, 2022, 312 (01) : 23 - 43