Efficient Rejection Sampling in the Entropy-Optimal Range
Pith reviewed 2026-05-22 21:24 UTC · model grok-4.3
The pith
A new rejection sampling algorithm generates samples from any finite discrete distribution P using an expected number of coin flips in the range [H(P), H(P)+2) while using linearithmic space and low per-sample time.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a carefully constructed rejection sampling procedure can realize the Knuth-Yao entropy-optimal bound of expected coin flips between H(P) and H(P)+2 while storing only linearithmic space and incurring negligible overhead per generated sample. This is shown to outperform the alias method in both runtime and entropy use on numerical experiments for arbitrary discrete distributions.
What carries the argument
A rejection sampling algorithm whose precomputed structures are sized to fit in linearithmic space while keeping the rejection steps from exceeding the entropy-optimal coin-flip bound.
If this is right
- The method can replace the alias method in applications where both runtime and entropy efficiency matter.
- Any finite discrete distribution can now be sampled with entropy-optimal coin flips without exponential space.
- The approach demonstrates measurable gains in both speed and entropy consumption over the alias method in experiments.
Where Pith is reading between the lines
- The technique may apply to other randomness sources beyond fair coins if the entropy accounting can be adjusted similarly.
- Hardware implementations of random number generators could adopt this to minimize power use per sample.
- The linearithmic space bound suggests the method remains feasible for distributions with millions of outcomes on modern machines.
Load-bearing premise
The specific construction of the precomputed tables and rejection steps can be realized in linearithmic space without the expected number of coin flips exceeding H(P)+2.
What would settle it
An implementation for a concrete distribution such as a power-law with n=10^5 outcomes that reports measured space usage exceeding O(n log n) bits or average coin flips exceeding H(P)+2 over 10^7 samples would falsify the claim.
Figures
read the original abstract
We study the problem of generating a random variate $X$ from a finite discrete probability distribution $P$ using an entropy source of independent fair coin flips. A classic result from Knuth and Yao shows that the optimal expected number of input coin flips per output sample lies between $H(P)$ and $H(P)\,{+}\,2$, where $H$ is the Shannon entropy function. However, implementing the Knuth and Yao ``entropy-optimal'' sampler entails a tradeoff between using either exponential space with low runtime per sample, or linear space with high runtime per sample. We introduce a new sampling algorithm that avoids this tradeoff: it requires linearithmic space, incurs negligible runtime overhead per sample, and uses an expected number of coin flips that lies in the entropy-optimal range $[H(P), H(P)\,{+}\,2)$. No previous sampler for discrete distributions simultaneously achieves these space, time, and entropy characteristics. Numerical experiments demonstrate improvements in runtime and entropy of the proposed method compared to the celebrated alias method.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a new rejection sampling algorithm for generating a random variate from a finite discrete distribution P using fair coin flips. It claims this algorithm requires only linearithmic space, incurs negligible per-sample runtime overhead, and achieves an expected number of coin flips strictly in the entropy-optimal interval [H(P), H(P)+2), thereby avoiding the space-time tradeoff inherent in the Knuth-Yao sampler while also outperforming the alias method in numerical experiments on runtime and entropy.
Significance. If the construction and bounds are rigorously established, the result would be significant for discrete sampling in computer science, as it would supply a practical method that simultaneously satisfies the three key criteria of space efficiency, time efficiency, and near-optimal entropy usage; such a combination has been elusive and would be useful in applications involving large or arbitrary discrete distributions.
major comments (2)
- [Abstract / Algorithm description] The central claim that a concrete construction exists achieving linearithmic space while keeping expected flips inside [H(P), H(P)+2) rests on an unshown algorithm and proof; no explicit definition of the precomputed tables, rejection mechanism, or entropy analysis appears in the visible text to support the assertion.
- [Experiments] Numerical experiments are invoked to demonstrate runtime and entropy improvements over the alias method, yet no tables, figures, specific distributions, or quantitative results are supplied, leaving the empirical support for the practical claims unverifiable.
minor comments (1)
- [Abstract] The abstract states performance properties without citing the section or theorem that establishes them; adding such pointers would improve readability.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive comments. We address the major comments point by point below.
read point-by-point responses
-
Referee: [Abstract / Algorithm description] The central claim that a concrete construction exists achieving linearithmic space while keeping expected flips inside [H(P), H(P)+2) rests on an unshown algorithm and proof; no explicit definition of the precomputed tables, rejection mechanism, or entropy analysis appears in the visible text to support the assertion.
Authors: The full manuscript (Sections 3 and 4) provides the explicit algorithm, precomputed tables, rejection mechanism, and entropy analysis establishing the linearithmic-space construction with expected flips in [H(P), H(P)+2). We agree the abstract and introduction could better highlight these elements for readers who do not immediately consult the technical sections. We will add a concise high-level overview of the tables and rejection rule in the introduction. revision: yes
-
Referee: [Experiments] Numerical experiments are invoked to demonstrate runtime and entropy improvements over the alias method, yet no tables, figures, specific distributions, or quantitative results are supplied, leaving the empirical support for the practical claims unverifiable.
Authors: We agree the experimental results must be presented with full documentation. The manuscript contains the experiments, but the submitted version omitted the supporting tables, figures, and quantitative details. We will include these in the revision, specifying the distributions tested and reporting the measured runtime and entropy values versus the alias method. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper presents a novel rejection sampling algorithm whose space, time, and entropy bounds are asserted via an explicit construction (detailed in the full manuscript) that is compared against the external Knuth-Yao theorem. No load-bearing step reduces by the paper's own equations or self-citation to a tautological redefinition, a fitted parameter renamed as a prediction, or an ansatz imported from the authors' prior work. The central claims rest on the new construction and numerical comparisons to the alias method rather than on any self-referential loop.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption An entropy source of independent fair coin flips is available.
Reference graph
Works this paper leans on
-
[1]
The complexity of nonuniform random number generation,
D. E. Knuth and A. C. Yao, “The complexity of nonuniform random number generation,” in Algorithms and Complexity: New Directions and Recent Results , J. F. Traub, Ed., Orlando, FL: Academic Press, Inc., 1976, pp. 357–428
work page 1976
-
[2]
G. S. Fishman, Monte Carlo: Concepts, Algorithms, and Applications (Springer Series in Operations Re- search). New York: Springer-Verlag, 1996. DOI: 10.1007/978-1-4757-2553-7
-
[3]
J. J. Faran, “Random noise generators,” General Radio Experimenter , vol. 42, no. 1, pp. 1–13, Jan. 1968
work page 1968
-
[4]
Random and pseudorandom sequences,
R. T. Kneusel, “Random and pseudorandom sequences,” in Random Numbers and Computers. Cham: Springer, 2018, ch. 1, pp. 1–25. DOI: 10.1007/978-3-319-77697-2_1
-
[5]
T. M. Cover and J. A. Thomas, Elements of Information Theory , 2nd ed. Hoboken: John Wiley & Sons, Inc.,
-
[6]
DOI: 10.1002/047174882X
-
[7]
High precision discrete Gaussian sampling on FPGAs,
S. S. Roy, F. Vercauteren, and I. Verbauwhede, “High precision discrete Gaussian sampling on FPGAs,” in Proceedings of the 20th International Conference on Selected Areas in Cryptography , ser. Lecture Notes in Computer Science, vol. 8282, Berlin: Springer, 2013, pp. 383–401. DOI: 10.1007/978-3-662-43414-7_19
-
[8]
Sampling from discrete Gaussians for lattice-based cryptography on a constrained device,
N. C. Dwarakanath and S. D. Galbraith, “Sampling from discrete Gaussians for lattice-based cryptography on a constrained device,” Applicable Algebra in Engineering, Communication and Computing , vol. 25, no. 3, pp. 159–180, Jun. 2014. DOI: 10.1007/s00200-014-0218-3
-
[9]
Gaussian sampling in lattice based cryptography,
J. Folláth, “Gaussian sampling in lattice based cryptography,” Tatra Mountains Mathematical Publications , vol. 60, no. 1, pp. 1–23, Sep. 2014. DOI: 10.2478/tmmp-2014-0022
-
[10]
Efficient implementation of Knuth Yao sampler on reconfigurable hardware,
P. Baidya, R. Paul, S. Mandal, and S. K. Debnath, “Efficient implementation of Knuth Yao sampler on reconfigurable hardware,” IEEE Computer Architecture Letters, vol. 23, no. 2, pp. 195–198, Sep. 2024. DOI: 10.1109/LCA.2024.3454490
-
[11]
Optimal approximate sampling from discrete probability distributions,
F. A. Saad, C. E. Freer, M. C. Rinard, and V . K. Mansinghka, “Optimal approximate sampling from discrete probability distributions,” Proceedings of the ACM on Programming Languages, vol. 4, no. POPL, Jan. 2020. DOI: 10.1145/3371104
-
[12]
L. Devroye, Non-Uniform Random Variate Generation. New York: Springer-Verlag, 1986. DOI: 10.1007/978- 1-4613-8643-8
-
[13]
B. D. Ripley, Stochastic Simulation (Wiley Series in Probability and Mathematical Statistics). New York: John Wiley & Sons, 1987. DOI: 10.1002/9780470316726
-
[14]
W. Hörmann, J. Leydold, and G. Derflinger, Automatic Nonuniform Random Variate Generation (Statistics and Computing). Berlin: Springer-Verlag, 2004. DOI: 10.1007/978-3-662-05946-3
-
[15]
Swartz, Darts, dice, and coins: Sampling from a discrete distribution , Dec
K. Swartz, Darts, dice, and coins: Sampling from a discrete distribution , Dec. 2011. [Online]. Available: https://www.keithschwarz.com/darts-dice-coins/
work page 2011
-
[16]
Leydold, UNU.RAN—Universal non-uniform random number generators , Nov
J. Leydold, UNU.RAN—Universal non-uniform random number generators , Nov. 2009. [Online]. Available: https://statmath.wu.ac.at/unuran/. 35
work page 2009
-
[17]
M. Galassi, J. Davies, J. Theiler, et al., GNU Scientific Library Reference Manual, Third. Surrey, UK: Network Theory Ltd., 2009, ISBN : 0954612078. [Online]. Available: http://www.gnu.org/software/gsl/
work page 2009
-
[18]
Lea, User’s guide to the GNU C++ library , version 2.0, Free Software Foundation, Inc., 1992
D. Lea, User’s guide to the GNU C++ library , version 2.0, Free Software Foundation, Inc., 1992
work page 1992
-
[19]
Approximation theory of output statistics,
T. S. Han and S. Verdú, “Approximation theory of output statistics,” IEEE Transactions on Information Theory, vol. 39, no. 3, pp. 752–772, May 1993. DOI: 10.1109/18.256486
-
[20]
Various techniques used in connection with random digits,
J. von Neumann, “Various techniques used in connection with random digits,” in Monte Carlo Method , ser. National Bureau of Standards Applied Mathematics Series 12, A. S. Householder, G. E. Forsythe, and H. H. Germond, Eds., Washington, DC: U.S. Government Printing Office, Jun. 1951, ch. 13, pp. 36–38
work page 1951
-
[21]
Interval algorithm for random number generation,
T. S. Han and M. Hoshi, “Interval algorithm for random number generation,” IEEE Transactions on Infor- mation Theory, vol. 43, no. 2, pp. 599–611, Mar. 1997. DOI: 10.1109/18.556116
-
[22]
Probabilistic Turing machines and complexity of computation,
J. T. Gill, “Probabilistic Turing machines and complexity of computation,” Ph.D. dissertation, University of California, Berkeley, 1972, ch. II
work page 1972
-
[23]
The fast loaded dice roller: A near-optimal exact sampler for discrete probability distributions,
F. A. Saad, C. E. Freer, M. C. Rinard, and V . K. Mansinghka, “The fast loaded dice roller: A near-optimal exact sampler for discrete probability distributions,” in Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics , ser. Proceedings of Machine Learning Research, vol. 108, PMLR, 2020
work page 2020
-
[24]
An efficient method for generating discrete random variables with general distributions,
A. J. Walker, “An efficient method for generating discrete random variables with general distributions,” ACM Transactions on Mathematical Software, vol. 3, no. 3, pp. 253–256, Sep. 1977. DOI: 10.1145/355744.355749
-
[25]
A linear algorithm for generating random numbers with a given distribution,
M. D. V ose, “A linear algorithm for generating random numbers with a given distribution,” IEEE Transactions on Software Engineering , vol. 17, no. 9, pp. 972–975, Sep. 1991. DOI: 10.1109/32.92917
-
[26]
Two algorithms for random number generation implemented by using arithmetic of limited precision,
T. Uyematsu and Y . Li, “Two algorithms for random number generation implemented by using arithmetic of limited precision,” IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. 86, no. 10, pp. 2542–2551, Oct. 2003
work page 2003
-
[27]
Optimal Discrete Uniform Generation from Coin Flips, and Applications
J. Lumbroso, “Optimal discrete uniform generation from coin flips, and applications,” arXiv, no. 1304.1916, Apr. 2013. DOI: 10.48550/arXiv.1304.1916
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1304.1916 1916
-
[28]
Optimal rolling of fair dice using fair coins,
M. Huber and D. Vargas, “Optimal rolling of fair dice using fair coins,” arXiv, no. 2412.20700, Dec. 2024. DOI: 10.48550/arXiv.2412.20700
-
[29]
Random variate generation with formal guarantees,
F. A. Saad and W. Lee, “Random variate generation with formal guarantees,” Proceedings of the ACM on Programming Languages, vol. 9, no. PLDI, Jun. 2025, Forthcoming
work page 2025
-
[30]
Generating discrete random variables in a computer,
G. Marsaglia, “Generating discrete random variables in a computer,” Communications of the ACM , vol. 6, no. 1, pp. 37–38, Jan. 1963. DOI: 10.1145/366193.366228
-
[31]
L. Devroye and C. Gravel, “Random variate generation using only finitely many unbiased, independently and identically distributed random bits,” arXiv, no. 1502.02539v6, Nov. 2020. DOI: 10.48550/arXiv.1502.02539
-
[32]
M. I. Shamos, “Computational geometry,” Ph.D. dissertation, Yale University, 1978
work page 1978
-
[33]
Computing over the reals: Where Turing meets Newton,
L. Blum, “Computing over the reals: Where Turing meets Newton,” Notices of the American Mathematical Society, vol. 51, no. 9, pp. 1024–1034, Oct. 2004. 36
work page 2004
-
[34]
About and around computing over the reals,
S. Feferman, “About and around computing over the reals,” in Computability: Turing, Gödel, Church, and Beyond, B. J. Copeland, C. J. Posy, and O. Shagrir, Eds., MIT Press, Jun. 2013. DOI: 10.7551/mitpress/ 8009.003.0004
-
[35]
S. Zimmerman, “An optimal search procedure,” The American Mathematical Monthly, vol. 66, no. 8, pp. 690– 693, Oct. 1959. DOI: 10.1080/00029890.1959.11989389
-
[36]
A computer program for the generation of random variables from any discrete distribution,
J. E. Norman and L. E. Cannon, “A computer program for the generation of random variables from any discrete distribution,” Journal of Statistical Computation and Simulation , vol. 1, no. 4, pp. 331–348, 1972. DOI: 10.1080/00949657208810026
-
[37]
On generating random variates from an empirical distribution,
H.-C. Chen and Y . Asau, “On generating random variates from an empirical distribution,” AIIE Transactions, vol. 6, no. 2, pp. 163–166, Jun. 1974. DOI: 10.1080/05695557408974949
-
[38]
Generating random floating-point numbers by dividing integers: A case study,
F. Goualard, “Generating random floating-point numbers by dividing integers: A case study,” in Computational Science, ser. Lecture Notes in Computer Science, vol. 12138, Cham: Springer, 2020, pp. 15–28. DOI: 10. 1007/978-3-030-50417-5_2
work page 2020
-
[39]
The efficient construction of an unbiased random sequence,
P. Elias, “The efficient construction of an unbiased random sequence,” Annals of Mathematical Statistics , vol. 43, no. 3, pp. 865–870, Jun. 1972. DOI: 10.1214/aoms/1177692552
-
[40]
Generation of discrete distributions from biased coins,
J. Abrahams, “Generation of discrete distributions from biased coins,” IEEE Transactions on Information Theory, vol. 42, no. 5, pp. 1541–1546, Sep. 1996. DOI: 10.1109/18.532895
-
[41]
Efficient generation of random variables from biased coins,
J. R. Roche, “Efficient generation of random variables from biased coins,” in Proceedings of the IEEE International Symposium on Information Theory , Piscataway: IEEE Press, 1991, pp. 169–169. DOI: 10.1109/ ISIT.1991.695225
-
[42]
Random number generation using a biased source,
S.-I. Pae, “Random number generation using a biased source,” Ph.D. dissertation, University of Illinois at Urbana-Champaign, 2005. DOI: 2142/81665
work page 2005
-
[43]
D. Kozen, “Optimal coin flipping,” in Horizons of the Mind. A Tribute to Prakash Panangaden: Essays Dedicated to Prakash Panangaden on the Occasion of His 60th Birthday , ser. Lecture Notes in Computer Science, vol. 8464, Cham: Springer, 2014, pp. 407–426. DOI: 10.1007/978-3-319-06880-0_21
-
[44]
A generalization of Peres’s algorithm for generating random bits from loaded dice,
S.-I. Pae, “A generalization of Peres’s algorithm for generating random bits from loaded dice,” IEEE Trans- actions on Information Theory , vol. 61, no. 2, pp. 751–757, Feb. 2015. DOI: 10.1109/TIT.2014.2381223
-
[45]
Binarization trees and random number generation,
S.-I. Pae, “Binarization trees and random number generation,” IEEE Transactions on Information Theory , vol. 66, no. 4, pp. 2581–2587, Apr. 2020. DOI: 10.1109/TIT.2019.2962480
-
[46]
Coalgebraic tools for randomness-conserving protocols,
D. Kozen and M. Soloviev, “Coalgebraic tools for randomness-conserving protocols,” in Proceedings of the 17th International Conference on Relational and Algebraic Methods in Computer Science , ser. Lecture Notes in Computer Science, vol. 11194, Cham: Springer, 2018, pp. 298–313. DOI: 10.1007/978-3-030-02149-8_18
-
[47]
Unbiased coin tossing with a biased coin,
W. Hoeffding and G. Simons, “Unbiased coin tossing with a biased coin,” The Annals of Mathematical Statistics, vol. 41, no. 2, pp. 341–352, Apr. 1970. DOI: 10.1214/aoms/1177697074
-
[48]
Volterra Equations Driven by Semimartingales
Q. F. Stout and B. Warren, “Tree algorithms for unbiased coin tossing with a biased coin,” The Annals of Probability, vol. 12, no. 1, pp. 212–222, Feb. 1984. DOI: 10.1214/aop/1176993384. 37
-
[49]
Fairing of biased coins in bounded time,
J. D. Cohen, “Fairing of biased coins in bounded time,” Yale University, Tech. Rep. Y ALEU/DCS/TR372, Mar. 1985
work page 1985
-
[50]
Iterating von Neumann’s procedure for extracting random bits,
Y . Peres, “Iterating von Neumann’s procedure for extracting random bits,” Annals of Statistics, vol. 20, no. 1, pp. 590–597, Mar. 1992. DOI: 10.1214/aos/1176348543
-
[51]
S. Pae and M. C. Loui, “Randomizing functions: Simulation of a discrete probability distribution using a source of unknown distribution,” IEEE Transactions on Information Theory , vol. 52, no. 11, pp. 4965–4976, Nov. 2006. DOI: 10.1109/TIT.2006.883555
-
[52]
Independent unbiased coin flips from a correlated biased source: A finite state Markov chain,
M. Blum, “Independent unbiased coin flips from a correlated biased source: A finite state Markov chain,” Combinatorica, vol. 6, no. 2, pp. 97–108, Jun. 1986. DOI: 10.1007/BF02579167
-
[53]
A note on approximation of uniform distributions from variable- to-fixed length codes,
F. Cicalese, L. Gargano, and U. Vaccaro, “A note on approximation of uniform distributions from variable- to-fixed length codes,” IEEE Transactions on Information Theory , vol. 52, no. 8, pp. 3772–3777, Aug. 2006. DOI: 10.1109/TIT.2006.878151
-
[54]
Generating random bits from an arbitrary source: Fundamental limits,
S. Vembu and S. Verdú, “Generating random bits from an arbitrary source: Fundamental limits,” IEEE Transactions on Information Theory , vol. 41, no. 5, pp. 1322–1332, Sep. 1995. DOI: 10.1109/18.412679
-
[55]
Optimal quantization for distribution synthesis,
G. Böcherer and B. C. Geiger, “Optimal quantization for distribution synthesis,” IEEE Transactions on Information Theory, vol. 62, no. 11, pp. 6162–6172, Nov. 2016. DOI: 10.1109/TIT.2016.2610433
-
[56]
The Bell System Tech- nical Journal27(3), 379–423 (1948) https://doi.org/10.1002/j.1538-7305.1948
C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal , vol. 27, 3 Jul. 1948. DOI: 10.1002/j.1538-7305.1948.tb01338.x
-
[57]
Bounds for entropy and divergence for distributions over a two-element set,
F. Topsøe, “Bounds for entropy and divergence for distributions over a two-element set,” Journal of Inequal- ities in Pure & Applied Mathematics , vol. 2, no. 2, Paper No. 25, 2001
work page 2001
-
[58]
Integer multiplication in time 𝑂 (𝑛 log 𝑛),
D. Harvey and J. Van Der Hoeven, “Integer multiplication in time 𝑂 (𝑛 log 𝑛),” Annals of Mathematics , vol. 193, no. 2, pp. 563–617, Mar. 2021. DOI: 10.4007/annals.2021.193.2.4
-
[59]
Multiplication of many-digital numbers by automatic computers,
A. Karatsuba and Y . Ofman, “Multiplication of many-digital numbers by automatic computers,” Proceedings of the USSR Academy of Sciences , vol. 145, no. 2, pp. 293–294, 1962
work page 1962
- [60]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.