Exact MCMC for Intractable Proposals
Pith reviewed 2026-05-23 19:25 UTC · model grok-4.3
The pith
Adapting Bernoulli factory methods yields exact MCMC sampling for proposals with unknown normalizing constants.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We show how Bernoulli factory MCMC algorithms, originally proposed for doubly intractable target distributions, can quite naturally be adapted to yield an exact MCMC sampling method for proposals with unknown normalizing constants. This is achieved by expressing the acceptance probability ratio in a form suitable for Bernoulli factory construction, allowing exact simulation of the accept/reject decision.
What carries the argument
A Bernoulli factory constructed for the acceptance probability ratio in the Metropolis-Hastings algorithm when the proposal has an intractable normalizing constant
If this is right
- Provides an exact alternative to approximate MCMC methods for intractable proposals.
- Enables use of a broader class of proposal distributions in Bayesian inference.
- Demonstrated through three diverse examples showing practical applicability.
- Eliminates the need for specialized algorithms to handle each type of intractable proposal.
Where Pith is reading between the lines
- The technique might generalize to other accept-reject based samplers beyond standard MCMC.
- It could lead to more flexible proposal designs in complex models where normalizing constants are hard to compute.
- Efficiency comparisons with approximation methods would be a natural next step for implementation.
Load-bearing premise
The acceptance probability ratio involving the intractable proposal normalizing constant can be expressed or bounded in a form that permits an exact Bernoulli factory.
What would settle it
Finding a target and proposal pair where the ratio cannot be bounded or expressed suitably for any Bernoulli factory, causing the method to fail in producing exact samples.
Figures
read the original abstract
Accept-reject based Markov chain Monte Carlo (MCMC) methods are the workhorse algorithm for Bayesian inference. These algorithms, like Metropolis-Hastings, require choosing a proposal distribution which is typically informed by the desired target distribution. Surprisingly, proposal distributions with unknown normalizing constants are not uncommon, even though for such a choice of a proposal, the Metropolis-Hastings acceptance ratio cannot be evaluated exactly. Across the literature, authors resort to approximation methods that yield inexact MCMC or develop specialized algorithms to combat this problem. We show how Bernoulli factory MCMC algorithms, originally proposed for doubly intractable target distributions, can quite naturally be adapted to yield an exact MCMC sampling method. We present three diverse and relevant examples demonstrating the usefulness of the Bernoulli factory approach to this problem.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that Bernoulli factory MCMC methods, developed for doubly intractable targets, can be adapted to Metropolis-Hastings proposals whose normalizing constants are intractable, thereby producing an exact MCMC algorithm without approximation. The adaptation is illustrated through three diverse examples.
Significance. If the adaptation is rigorously justified, the result would be useful because it supplies an exact sampling route for a practically relevant class of proposals where authors currently rely on approximations or specialized constructions. The reuse of existing Bernoulli factory machinery is a strength when the required acceptance probability admits an exact factory representation.
major comments (2)
- [Abstract and §3 (examples)] The central claim rests on the assertion that the acceptance ratio (including the ratio of intractable proposal normalizers Z(x)/Z(x')) can be expressed in a form that admits an exact Bernoulli factory. The manuscript must explicitly state the conditions on the proposal family under which this representation exists and verify it for each of the three examples; otherwise the exactness guarantee is conditional rather than general.
- [§2 (method)] It is unclear whether the Bernoulli factory construction preserves the correct stationary distribution when the factory is only an unbiased estimator of the acceptance probability rather than an exact indicator. A short derivation showing that the overall transition kernel still satisfies detailed balance would strengthen the exactness claim.
minor comments (2)
- Notation for the intractable proposal normalizer should be introduced once and used consistently; the abstract uses Z(x) while later sections appear to switch between Z and other symbols.
- The three examples would benefit from a short table comparing wall-clock time or effective sample size against a standard approximate alternative (e.g., pseudo-marginal or noisy MH) to quantify the practical overhead of the exact factory.
Simulated Author's Rebuttal
We thank the referee for the thoughtful report and the opportunity to clarify the exactness guarantees in our work. We address each major comment below and will revise the manuscript to incorporate the requested details.
read point-by-point responses
-
Referee: [Abstract and §3 (examples)] The central claim rests on the assertion that the acceptance ratio (including the ratio of intractable proposal normalizers Z(x)/Z(x')) can be expressed in a form that admits an exact Bernoulli factory. The manuscript must explicitly state the conditions on the proposal family under which this representation exists and verify it for each of the three examples; otherwise the exactness guarantee is conditional rather than general.
Authors: We agree that the conditions should be stated explicitly to make the exactness claim unconditional within the relevant class. In the revision we will add a short subsection to §2 that specifies the general conditions on the proposal family (namely, that the ratio of normalizers Z(x)/Z(x') together with the target ratio must admit a Bernoulli factory representation, which holds when the ratio can be written as a product or ratio of functions that are pointwise computable and bounded in [0,1] after suitable scaling). We will then verify these conditions explicitly for each of the three examples in §3, confirming that the required factories exist and can be implemented without approximation. revision: yes
-
Referee: [§2 (method)] It is unclear whether the Bernoulli factory construction preserves the correct stationary distribution when the factory is only an unbiased estimator of the acceptance probability rather than an exact indicator. A short derivation showing that the overall transition kernel still satisfies detailed balance would strengthen the exactness claim.
Authors: We will add the requested derivation in §2. The key point is that a Bernoulli factory, when it terminates, returns an exact Bernoulli random variable whose success probability equals the target acceptance probability p; it is not merely an unbiased estimator of p but an exact sampler from Bernoulli(p). Consequently the transition kernel is identical to the standard Metropolis-Hastings kernel that uses this p, and detailed balance follows from the usual argument once the factory has produced its exact 0/1 outcome. The derivation will be kept concise and will emphasize that no approximation enters the kernel. revision: yes
Circularity Check
No significant circularity
full rationale
The paper presents an adaptation of existing Bernoulli factory MCMC methods (originally for doubly intractable targets) to proposals with intractable normalizing constants. The central claim of exact MCMC follows from the external properties of Bernoulli factories applied to the acceptance ratio, with three examples given. No derivation reduces by construction to fitted inputs, self-definitions, or self-citation chains; the approach is an extension rather than a self-referential result. The derivation chain is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Adams, R., Murray, I., and Mackay, D. (2009). Tractable nonparametric bayesian inference in poisson processes with gaussian process intensities. Proceedings of the 26th International Conference on Machine Learning , 382:2
work page 2009
-
[2]
Agrawal, S., Vats, D., atuszy \'n ski, K., and Roberts, G. O. (2023). Optimal scaling of MCMC beyond M etropolis. Advances in Applied Probability , 55(2):492--509
work page 2023
-
[3]
Barker, A. A. (1965). Monte C arlo calculations of the radial distribution functions for a proton? electron plasma. Australian Journal of Physics , 18(2):119--134
work page 1965
-
[4]
Botev, Z. I. (2016). The normal law under linear restrictions: Simulation and estimation via minimax tilting. Journal of the Royal Statistical Society Series B: Statistical Methodology , 79(1):125–148
work page 2016
-
[5]
Chen, N. and Campbell, T. (2024). Coreset markov chain M onte C arlo. In International Conference on Artificial Intelligence and Statistics , pages 4438--4446. PMLR
work page 2024
-
[6]
Cody, W. J. (1969). Rational C hebyshev approximations for the error function. Mathematics of Computation , 23(107):631--637
work page 1969
-
[7]
Dunson, D. B. and Johndrow, J. E. (2020). The H astings algorithm at fifty. Biometrika , 107(1):1--23
work page 2020
-
[8]
M., Hughes, J., Vats, D., Dai, N., Gupta, K., and Maji, U
Flegal, J. M., Hughes, J., Vats, D., Dai, N., Gupta, K., and Maji, U. (2021). mcmcse: Monte Carlo Standard Errors for MCMC . Riverside, CA, and Kanpur, India. R package version 1.5-0
work page 2021
-
[9]
B., Łatuszyński, K., and Roberts, G
Gonçalves, F. B., Łatuszyński, K., and Roberts, G. O. (2017). Barker's algorithm for Bayesian inference with intractable likelihoods
work page 2017
-
[10]
Herbei, R. and Berliner, L. M. (2014). Estimating ocean circulation: an MCMC approach with approximated likelihoods via the B ernoulli factory. Journal of the American Statistical Association , 109(507):944--954
work page 2014
-
[11]
Ihler, A., Fisher, J., Moses, R., and Willsky, A. (2005). Nonparametric belief propagation for self-localization of sensor networks. IEEE Journal on Selected Areas in Communications , 23(4):809--819
work page 2005
-
[12]
Keane, M. and O'Brien, G. L. (1994). A B ernoulli factory. ACM Transactions on Modeling and Computer Simulation (TOMACS) , 4(2):213--219
work page 1994
-
[13]
atuszy \'n ski, K., Kosmidis, I., Papaspiliopoulos, O., and Roberts, G. O. (2011). Simulating events of unknown probabilities via reverse time M artingales. Random Structures & Algorithms , 38(4):441--452
work page 2011
-
[14]
atuszy \'n ski, K. and Roberts, G. O. (2013). Clts and asymptotic variance of time-sampled M arkov chains. Methodology and Computing in Applied Probability , 15(1):237--247
work page 2013
-
[15]
Livingstone, S. and Zanella, G. (2022). The B arker proposal: combining robustness and efficiency in gradient-based MCMC . Journal of the Royal Statistical Society Series B: Statistical Methodology , 84(2):496--523
work page 2022
-
[16]
L \'o pez-Lopera, A. F. (2018). lineqgpr: Gaussian process regression models with linear inequality constraints. R package version 0.0 , 3
work page 2018
-
[17]
Lopez-Lopera, A. F., John, S., and Durrande, N. (2019). Gaussian process modulated cox processes under linear inequality constraints. In Chaudhuri, K. and Sugiyama, M., editors, Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics , volume 89 of Proceedings of Machine Learning Research , pages 1997--2006. PMLR
work page 2019
-
[18]
M ller, J., Pettitt, A., Reeves, R., and Berthelsen, K. (2006). An efficient markov chain monte carlo method for distributions with intractable normalising constants. Biometrika , 93(2):451--458
work page 2006
-
[19]
Neal, R. M. (2011). MCMC Using H amiltonian Dynamics . Chapman and Hall/CRC
work page 2011
-
[20]
Peskun, P. H. (1973). Optimum M onte- C arlo sampling using M arkov chains. Biometrika , 60(3):607--612
work page 1973
-
[21]
Roberts, G. O. and Rosenthal, J. S. (2001). Optimal scaling for various M etropolis- H astings algorithms. Statistical science , 16(4):351--367
work page 2001
-
[22]
Roberts, G. O. and Tweedie, R. L. (1996). Exponential convergence of L angevin distributions and their discrete approximations. Bernoulli , 2(4):341--363
work page 1996
- [23]
-
[24]
Tak, H., Meng, X.-L., and van Dyk, D. A. (2018). A repelling–attracting metropolis algorithm for multimodality. Journal of Computational and Graphical Statistics , 27(3):479–490
work page 2018
-
[25]
Vats, D., Flegal, J. M., and Jones, G. L. (2019). Multivariate output analysis for M arkov chain M onte C arlo. Biometrika , 106(2):321--337
work page 2019
-
[26]
B., atuszy \'n ski, K., and Roberts, G
Vats, D., Gon c alves, F. B., atuszy \'n ski, K., and Roberts, G. O. (2022). Efficient bernoulli factory M arkov chain M onte C arlo for intractable posteriors. Biometrika , 109(2):369--385
work page 2022
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.