pith. sign in

arxiv: 2502.08157 · v2 · submitted 2025-02-12 · ✦ hep-ph · hep-ex· physics.data-an

Bring the noise: exact inference from noisy simulations in collider physics

Pith reviewed 2026-05-23 03:32 UTC · model grok-4.3

classification ✦ hep-ph hep-exphysics.data-an
keywords Monte Carlo simulationspseudo-marginal MCMCcollider physicsLHCPoisson likelihoodexact inferenceneutralino searchesBayesian inference
0
0 comments X

The pith

A pseudo-marginal MCMC method returns exact inferences from noisy Monte Carlo simulations in collider physics.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a technique to obtain exact Bayesian inferences even when the likelihood is estimated with noisy Monte Carlo simulations. It does this by constructing an unbiased estimator for a Poisson likelihood and embedding it in a pseudo-marginal MCMC framework. This allows researchers at the LHC and similar experiments to use their standard simulation tools without introducing approximation errors into the final results. Sympathetic readers would care because current methods accept some bias from finite simulation statistics, while this approach removes that bias at comparable computational cost.

Core claim

The central claim is that an unbiased estimator for the Poisson likelihood, constructed by drawing a Poisson-distributed number of Monte Carlo events, can be used within pseudo-marginal MCMC to produce exact posterior inferences despite the noise in the simulations. The method is demonstrated on a simplified model search for neutralinos and charginos at the LHC, showing that exact inferences are obtained for a similar computational cost to approximate ones and that the results are robust with respect to the number of events generated per point.

What carries the argument

An unbiased estimator for a Poisson likelihood that uses a Poisson-distributed number of Monte Carlo events, placed inside pseudo-marginal MCMC.

If this is right

  • Exact inferences are obtained for a similar computational cost to approximate ones from existing methods.
  • Inferences remain robust with respect to the number of events generated per point.
  • The unbiased estimator uses a Poisson-distributed number of MC events.
  • A biased estimator is also possible whose bias decays factorially with increasing number of MC events.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar techniques could be applied to other fields that rely on Monte Carlo simulations for likelihoods, such as cosmology or epidemiology.
  • If widely adopted, this would allow experimental collaborations to reduce the number of simulated events needed for reliable results.
  • The method preserves the exactness property without introducing new convergence issues, suggesting it can be integrated into existing analysis pipelines.

Load-bearing premise

The Monte Carlo simulations can produce estimators that are made unbiased for the Poisson likelihood by using a Poisson-distributed number of events.

What would settle it

Running the MCMC with the new estimator on a problem where the true posterior is known exactly from analytic calculation or infinite statistics, and checking whether the recovered posterior matches within sampling error.

Figures

Figures reproduced from arXiv: 2502.08157 by Anders Kvellestad, Andrew Fowlie, Benjamin Farmer, Christopher Chang.

Figure 1
Figure 1. Figure 1: FIG. 1. The posterior pdf reconstructed by MCMC for the unknown [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 3
Figure 3. Figure 3: FIG. 3. The posterior pdf for the selection efficiency from the MLE [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 2
Figure 2. Figure 2: FIG. 2. Effective number of posterior samples for the unknown [PITH_FULL_IMAGE:figures/full_fig_p005_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: FIG. 4. Posterior pdf for the [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
read the original abstract

We rely on Monte Carlo (MC) simulations to interpret searches for new physics at the Large Hadron Collider (LHC) and elsewhere. These simulations result in noisy and approximate estimators of selection efficiencies and likelihoods. In this context we pioneer an exact-approximate computational method - exact-approximate Markov Chain Monte Carlo (MCMC), also known as pseudo-marginal MCMC - that returns exact inferences despite noisy simulations. To do so, we introduce an unbiased estimator for a Poisson likelihood. We demonstrate the new estimator and new techniques in examples based on a search for neutralinos and charginos at the LHC using a simplified model. We find attractive performance characteristics - exact inferences are obtained for a similar computational cost to approximate ones from existing methods and inferences are robust with respect to the number of events generated per point. The unbiased estimator uses a Poisson-distributed number of MC events; it is also possible to construct a biased estimator whose bias decays factorially with increasing number of MC events.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The paper claims to pioneer the use of pseudo-marginal MCMC (exact-approximate MCMC) for collider physics by constructing an unbiased estimator of the Poisson likelihood from Monte Carlo simulations that employ a Poisson-distributed number of events. This enables exact Bayesian inferences despite noisy simulations. The method is demonstrated on a simplified model for neutralino/chargino searches at the LHC, with reported performance that matches the cost of approximate methods while remaining robust to the number of events generated per parameter point. A secondary biased estimator with factorial bias decay is also presented.

Significance. If the unbiasedness construction and its integration into pseudo-marginal MCMC hold, the work supplies a practical route to exact inference in LHC analyses that avoids the systematic biases from finite-MC approximations. Credit is due for the explicit estimator, the direct calculation of its expectation, and the numerical demonstrations on a realistic search channel. The robustness result and the factorial-bias alternative are useful additions that could affect how statistical interpretations are performed in future searches.

minor comments (2)
  1. [Abstract] Abstract and §2: the phrase 'exact inferences' should be accompanied by a brief qualifier that exactness holds for the target posterior in the limit of infinite MCMC iterations, to prevent misreading by experimental readers.
  2. The manuscript would benefit from an explicit statement in the methods section of how the Poisson number of events is implemented in standard event generators that normally produce fixed-N samples.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, recognition of the method's potential utility for LHC analyses, and recommendation for minor revision. We appreciate the credit given to the unbiased estimator construction, its integration with pseudo-marginal MCMC, and the numerical demonstrations.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper constructs an explicit unbiased estimator for the Poisson likelihood by drawing a Poisson-distributed number of Monte Carlo events and shows via direct expectation calculation that its mean recovers the target likelihood. This estimator is then inserted into the standard pseudo-marginal MCMC framework. No step reduces by definition to a fitted parameter, no self-citation supplies a load-bearing uniqueness theorem, and the central claim (exactness preserved under the unbiased estimator) is verified by algebraic expectation rather than by renaming or circular re-use of the target quantity. The numerical demonstrations on the neutralino/chargino example serve only as validation, not as the source of the estimator itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review based on abstract only; ledger reflects limited visibility into assumptions. Relies on standard MCMC convergence properties and the existence of an unbiased estimator construction.

axioms (1)
  • standard math Markov chains constructed with unbiased likelihood estimators converge to the correct posterior
    Core property of pseudo-marginal MCMC invoked by the method.

pith-pipeline@v0.9.0 · 5707 in / 1178 out tokens · 28702 ms · 2026-05-23T03:32:25.809436+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Many Wrongs Make a Right: Leveraging Biased Simulations Towards Unbiased Parameter Inference

    hep-ph 2026-04 unverdicted novelty 7.0

    Template-Adapted Mixture Model uses many biased simulations for data-driven estimates of signal and background distributions, yielding unbiased signal fraction estimates with well-calibrated uncertainties.

Reference graph

Works this paper leans on

45 extracted references · 45 canonical work pages · cited by 1 Pith paper · 20 internal anchors

  1. [1]

    That is, on average, the estimator equals the exact likelihood

    The estimator is unbiased (at least up to a constant factor), 〈 ˆL〉 = C L (1) for constant C . That is, on average, the estimator equals the exact likelihood

  2. [2]

    Bring the noise: exact inference from noisy simulations in collider physics

    The estimator is never negative, ˆL ≥ 0. (2) 2 Though note that other algorithms have exact-approximate variants, e.g., pseudo-marginal slice-sampling [10]. arXiv:2502.08157v1 [hep-ph] 12 Feb 2025 2 We later discuss adaptations of MCMC that remove this requirement. If the estimator satisfies these requirements, the MCMC algo- rithm converges to the exactl...

  3. [3]

    We take a flat prior on ϵ and fix the cross section to σ = 1000 fb

    First, we construct a simple one-dimensional model in which the selection efficiency, ϵ, is an input parameter. We take a flat prior on ϵ and fix the cross section to σ = 1000 fb

  4. [4]

    We fix the production cross section to σ = 1000fb and compute a selection efficiency for the SRWZ_15 signal region as a function of (m1,m2) through SModelS [30]

    Second, we construct a simplified model based on the TChiWZ topology: mass-degenerate χ± 1 and χ2 particles are pair produced and decay exclusively through χ± 1 → W χ1 and χ2 → Z χ1, respectively. We fix the production cross section to σ = 1000fb and compute a selection efficiency for the SRWZ_15 signal region as a function of (m1,m2) through SModelS [30]...

  5. [5]

    Lastly, we use the same simplified model, but compute the production cross-section σ(pp → χ2 χ± 1 ) as a func- tion of (m1,m2) using tabulated wino-like neutralino- chargino production cross-sections [31–33]. The tabulated cross sections were computed at NLO + NLL precision assuming mass-degenerate wino-like χ2 and χ± 1 , and a bino-like χ1 and use an env...

  6. [6]

    AbdusSalam et al., Simple and statistically sound recommendations for analysing physical theories, Rept

    S.S. AbdusSalam et al., Simple and statistically sound recommendations for analysing physical theories, Rept. Prog. Phys. 85 (2022) 052201 [2012.09874]

  7. [7]

    Brehmer and K

    J. Brehmer and K. Cranmer, Simulation-based inference methods for particle physics, in Artificial Intelligence for High Energy Physics, pp. 579–611 (2020), DOI [2010.06439]

  8. [8]

    Matrix Element Method in HEP: Transfer Functions, Efficiencies, and Likelihood Normalization

    I. Volobouev, Matrix Element Method in HEP: Transfer Functions, Efficiencies, and Likelihood Normalization, 1101.2259

  9. [9]

    An Introduction to PYTHIA 8.2

    T . Sjöstrand, S. Ask, J.R. Christiansen, R. Corke, N. Desai, P . Ilten et al.,An introduction to PYTHIA 8.2, Comput. Phys. Commun. 191 (2015) 159 [1410.3012]

  10. [10]

    Herwig++ Physics and Manual

    M. Bahr et al., Herwig++ Physics and Manual, Eur . Phys. J. C58 (2008) 639 [0803.0883]

  11. [11]

    Bothmann et al.,Event generation with Sherpa 2.2, SciPost Phys.7(2019) 034, arXiv:1905.09127 [hep-ph]

    S HERPA collaboration, Event Generation with Sherpa 2.2, SciPost Phys. 7 (2019) 034 [1905.09127]

  12. [12]

    DELPHES 3 collaboration, DELPHES 3, A modular framework for fast simulation of a generic collider experiment, JHEP 02 (2014) 057 [1307.6346]

  13. [13]

    R. Brun, F . Bruyant, F . Carminati, S. Giani, M. Maire, A. McPherson et al., GEANT Detector Description and Simulation Tool, CERN Program Library Long Writeup (1994)

  14. [14]

    Brooks, A

    S. Brooks, A. Gelman, G. Jones and X. Meng, Handbook of Markov Chain Monte Carlo, CRC Press, United States (May, 2011)

  15. [15]

    Pseudo-Marginal Slice Sampling

    I. Murray and M.M. Graham, Pseudo-Marginal Slice Sampling, in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, A. Gretton and C.C. Robert, eds., vol. 51 of Proc. Mach. Learn. Res., (Cadiz, Spain), pp. 911–919, Oct., 2016 [1510.02958]

  16. [16]

    GAMBIT collaboration, ColliderBit: a GAMBIT module for the calculation of high-energy collider observables and likelihoods, Eur . Phys. J. C77 (2017) 795 [1705.07919]

  17. [17]

    GAMBIT collaboration, GAMBIT: The Global and Modular Beyond-the-Standard-Model Inference Tool, Eur . Phys. J. C77 (2017) 784 [1705.07908]

  18. [18]

    Overview of Approximate Bayesian Computation

    S.A. Sisson, Y. Fan and M.A. Beaumont, Overview of Approximate Bayesian Computation, in Handbook of Approximate Bayesian Computation, Chapman and Hall/CRC (2018), DOI [1802.09720]

  19. [19]

    Beaumont, Estimation of Population Growth or Decline in Genetically Monitored Populations, Genetics 164 (2003) 1139

    M.A. Beaumont, Estimation of Population Growth or Decline in Genetically Monitored Populations, Genetics 164 (2003) 1139

  20. [20]

    Møller, A.N

    J. Møller, A.N. Pettitt, R. Reeves and K.K. Berthelsen, An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants, Biometrika 93 (2006) 451

  21. [21]

    The pseudo-marginal approach for efficient Monte Carlo computations

    C. Andrieu and G.O. Roberts, The pseudo-marginal approach for efficient Monte Carlo computations, Ann. Stat. 37 (2009) [0903.5480]

  22. [22]

    A Noisy Monte Carlo Algorithm

    L. Lin, K.F . Liu and J.H. Sloan,A Stochastic Monte Carlo algorithm, Phys. Rev. D 61 (2000) 074505 [hep-lat/9905033]

  23. [23]

    Dembinski and M

    H. Dembinski and M. Schmelling, Bias, variance, and confidence intervals for efficiency estimators in particle physics experiments, 2110.00294

  24. [24]

    Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator

    A. Doucet, M. Pitt, G. Deligiannidis and R. Kohn, Efficient implementation of Markov chain Monte Carlo when using an unbiased likelihood estimator, Biometrika 102 (2015) 295 [1210.1871]

  25. [25]

    On the efficiency of pseudo-marginal random walk Metropolis algorithms

    C. Sherlock, A.H. Thiery, G.O. Roberts and J.S. Rosenthal, On the efficiency of pseudo-marginal random walk Metropolis algorithms, Ann. Stat. 43 (2015) [ 1309.7209]

  26. [26]

    Convergence properties of pseudo-marginal Markov chain Monte Carlo algorithms

    C. Andrieu and M. Vihola, Convergence properties of pseudo-marginal Markov chain Monte Carlo algorithms, Ann. Appl. Probab. 25 (2015) [ 1210.1484]

  27. [27]

    Optimal scaling for the pseudo-marginal random walk Metropolis: insensitivity to the noise generating mechanism

    C. Sherlock, Optimal Scaling for the Pseudo-Marginal Random Walk Metropolis: Insensitivity to the Noise Generating Mechanism, Methodol. Comput. In Appl. Probab. 18 (2015) 869 [1408.4344]

  28. [28]

    Glasser, Minimum Variance Unbiased Estimators for Poisson Probabilities, Technometrics 4 (1962) 409

    G.J. Glasser, Minimum Variance Unbiased Estimators for Poisson Probabilities, Technometrics 4 (1962) 409

  29. [29]

    A.-M. Lyne, M. Girolami, Y. Atchadé, H. Strathmann and D. Simpson, On Russian Roulette Estimates for Bayesian Inference with Doubly-Intractable Likelihoods, Stat. Sci. 30 (2015) [ 1306.4032]

  30. [30]

    Establishing some order amongst exact approximations of MCMCs

    C. Andrieu and M. Vihola, Establishing some order amongst exact approximations of MCMCs, Ann. Appl. Probab. 26 (2016) [1404.6909]

  31. [31]

    Goodman and J

    J. Goodman and J. Weare, Ensemble samplers with affine invariance, Commun. In Appl. Math. Comput. Sci. 5 (2010) 65

  32. [32]

    emcee: The MCMC Hammer

    D. Foreman-Mackey, D.W . Hogg, D. Lang and J. Goodman, emcee: The MCMC Hammer, Publ. Astron. Soc. Pac. 125 (2013) 306 [1202.3665]

  33. [33]

    Kumar, C

    R. Kumar, C. Carroll, A. Hartikainen and O. Martin, ArviZ a unified library for exploratory analysis of Bayesian models in Python, J. Open Source Softw. 4 (2019) 1143

  34. [34]

    ATLAS collaboration, Search for chargino–neutralino pair production in final states with three leptons and missing transverse momentum in ps = 13 TeV pp collisions with the ATLAS detector, Eur . Phys. J. C81 (2021) 1118 [2106.01676]

  35. [35]

    Alguero, J

    G. Alguero, J. Heisig, C.K. Khosa, S. Kraml, S. Kulkarni, A. Lessa et al., Constraining new physics with SModelS version 2, JHEP 08 (2022) 068 [2112.00769]

  36. [36]

    LHC SUSY Cross Section Working Group, NLO-NLL wino-like chargino-neutralino (N2C1) cross sections, 2017

  37. [37]

    B. Fuks, M. Klasen, D.R. Lamprea and M. Rothering, Gaugino production in proton-proton collisions at a center-of-mass energy of 8 TeV, JHEP 10 (2012) 081 [1207.2159]

  38. [38]

    B. Fuks, M. Klasen, D.R. Lamprea and M. Rothering, Precision predictions for electroweak superpartner production at hadron colliders with RESUMMINO , Eur . Phys. J. C73 (2013) 2480 [1304.0790]

  39. [39]

    On nonnegative unbiased estimators

    P .E. Jacob and A.H. Thiery,On nonnegative unbiased estimators, Ann. Stat. 43 (2015) 769 [ 1309.6473]

  40. [40]

    Bias in parametric estimation: reduction and useful side-effects

    I. Kosmidis, Bias in parametric estimation: reduction and useful side-effects, WIREs Comput. Stat. 6 (2014) 185–196 [1311.6311]

  41. [41]

    Zhenhua, Advanced Statistical Theory I, Lecture Notes (2019)

    L. Zhenhua, Advanced Statistical Theory I, Lecture Notes (2019)

  42. [42]

    Jakob, J

    W . Jakob, J. Rhinelander, D. Moldovan and others,pybind11 – Seamless operability between C++11 and Python, 2017. 8 Appendix A: Proof that estimator is UMVUE We wish to show that our estimator is the uniform minimum variance unbiased estimator (UMVUE). That is, it is expected to equal the true likelihood when k ∼ Po(ϵnMC), and that the variance of our est...

  43. [43]

    ideal.hpp

    C++ implementation We provide a simple header-only library ideal.hpp providing the function: double umvue_poisson_like(int k, double b, int o, int n_mc, double n_exp) in the namespace ideal. This function returns an unbiased estimate of a Poisson likelihood when k events were simulated from an expected n_mc simulations, and o events were observed in a sam...

  44. [44]

    __main__

    Python bindings We supply Python bindings for the C++ functions umvue_poisson_like and umvue_draw_n_mc using pybind11 [37]. The ideal module can be built and installed by make. The equivalent example program would be: 1 import ideal 2 3 if __name__ == "__main__": 4 print(ideal.umvue_poisson_like(1000, 10, 20, 10000, 100)) 5 print(ideal.umvue_draw_n_mc(1000))

  45. [45]

    UMVUE" # by default,

    Implementation in GAMBIT We use code similar to that in Appendix C 1 as part of the ColliderBit module of GAMBIT and implement an option to enable the UMVUE. The UMVUE may be turned on by setting the estimator in the rules governing the MC generation of collider events in the yaml input file: 1 Rules: 2 - capability: RunMC 3 function: operateLHCLoop 4 opt...