Confidence, Statistical Evidence and Relative Belief with Applications to a Problem in Particle Physics
Pith reviewed 2026-06-27 13:57 UTC · model grok-4.3
The pith
Relative belief inferences satisfy the principle of evidence and achieve frequentist confidence levels for intervals in the Poisson model used in particle physics.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Relative belief inferences satisfy the principle of evidence and, when the errors in these inferences are controlled, also satisfy repeated sampling requirements such as achieving given confidence levels for intervals in the Poisson signal-plus-background model.
What carries the argument
Relative belief inferences, which are based on the relative belief ratio (the ratio of posterior to prior density) and which order parameter values in accordance with the principle of evidence.
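For reference, the relative belief ratio is usually defined as the ratio of posterior to prior density for the quantity of interest. The notation below ($\Psi$ for the parameter of interest, $\pi_\Psi$ for its prior density, $Pl_\Psi$ for the plausible region) follows common usage in the relative belief literature and is an assumption here, since this page quotes no formula directly:

```latex
RB_\Psi(\psi \mid x) = \frac{\pi_\Psi(\psi \mid x)}{\pi_\Psi(\psi)},
\qquad
Pl_\Psi(x) = \{\psi : RB_\Psi(\psi \mid x) > 1\}.
```

Under the principle of evidence, a value with $RB_\Psi(\psi \mid x) > 1$ has evidence in its favor, a value with $RB_\Psi(\psi \mid x) < 1$ has evidence against it, and a value with $RB_\Psi(\psi \mid x) = 1$ has no evidence either way.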
If this is right
- Intervals for the Poisson signal-plus-background model can be constructed that both respect the ordering of evidence and attain specified confidence levels.
- The method supplies uncertainty quantification that meets both evidence and repeated-sampling criteria in the particle-physics setting.
- These intervals stand as a direct alternative to Feldman-Cousins intervals for the same Poisson problem.
- Controlling the errors in relative belief inferences yields frequentist validity in addition to the evidence ordering.
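For comparison, here is a minimal sketch of the Feldman-Cousins likelihood-ratio ordering for a single Poisson count with a known background. The function names, the signal grid, the count cutoff `x_max`, and the 90% level are illustrative assumptions rather than anything taken from the paper, and the sketch omits the refinements used for published tables.

```python
import math

def poisson_pmf(x, mean):
    """Poisson probability of count x; requires mean > 0."""
    return math.exp(x * math.log(mean) - mean - math.lgamma(x + 1))

def fc_acceptance(lam, b, cl=0.90, x_max=100):
    """Counts accepted at signal `lam` under the likelihood-ratio ordering."""
    ranked = []
    for x in range(x_max + 1):
        lam_best = max(0.0, x - b)  # best-fit signal, restricted to lam >= 0
        ratio = poisson_pmf(x, lam + b) / poisson_pmf(x, lam_best + b)
        ranked.append((ratio, x))
    ranked.sort(reverse=True)  # highest likelihood ratio first
    accepted, total = set(), 0.0
    for _, x in ranked:
        accepted.add(x)
        total += poisson_pmf(x, lam + b)
        if total >= cl:  # stop once the acceptance set has probability >= cl
            break
    return accepted

def fc_interval(x_obs, b, cl=0.90, lam_grid=None):
    """Smallest and largest grid signal whose acceptance set contains x_obs."""
    if lam_grid is None:
        lam_grid = [0.05 * i for i in range(301)]  # hypothetical grid on [0, 15]
    inside = [lam for lam in lam_grid if x_obs in fc_acceptance(lam, b, cl)]
    return min(inside), max(inside)

if __name__ == "__main__":
    # Hypothetical inputs: 4 observed counts, background mean 3.
    print(fc_interval(4, 3.0))
```

The interval is read off by inverting the acceptance sets over a grid, so its endpoints are only as fine as the grid spacing.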
Where Pith is reading between the lines
- The same error-control technique might be applied to other counting models that arise in physics experiments.
- Computational procedures for controlling the errors could be developed once for a family of discrete distributions rather than case by case.
- Reporting both the evidence ordering and the achieved coverage could become a standard practice for interval estimation in high-energy physics analyses.
Load-bearing premise
The errors in relative belief inferences can be controlled in a manner that delivers the stated frequentist confidence levels for the Poisson signal-plus-background model without further model-specific assumptions.
What would settle it
A Monte Carlo simulation that checks whether the relative belief intervals attain the nominal coverage probability for the true signal strength when data are repeatedly drawn from the Poisson signal-plus-background distribution.
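Such a check could be sketched as follows. This is not the authors' implementation: it assumes a single Poisson count, a known background `b`, and a uniform prior on a hypothetical grid of signal values, and every function name is illustrative. Under a uniform prior the relative belief ratio reduces to the likelihood divided by its average over the grid.

```python
import math
import random

def poisson_pmf(x, mean):
    """Poisson probability of count x; requires mean > 0."""
    return math.exp(x * math.log(mean) - mean - math.lgamma(x + 1))

def sample_poisson(mean, rng):
    """Draw one count by inverting the Poisson CDF (adequate for small means)."""
    u, x = rng.random(), 0
    p = math.exp(-mean)
    cdf = p
    while u > cdf and x < 1000:  # the cap guards against rounding in the tail
        x += 1
        p *= mean / x
        cdf += p
    return x

def plausible_interval(x, b, grid):
    """Range of grid values with RB > 1 under a uniform prior on `grid`."""
    like = [poisson_pmf(x, b + lam) for lam in grid]
    m = sum(like) / len(like)  # prior predictive probability of x
    keep = [lam for lam, l in zip(grid, like) if l > m]
    return min(keep), max(keep)

def mc_coverage(lam_true, b, grid, n_rep=20000, seed=0):
    """Fraction of simulated counts whose plausible interval covers lam_true."""
    rng = random.Random(seed)
    cache, hits = {}, 0
    for _ in range(n_rep):
        x = sample_poisson(b + lam_true, rng)
        if x not in cache:
            cache[x] = plausible_interval(x, b, grid)
        lo, hi = cache[x]
        hits += lo <= lam_true <= hi
    return hits / n_rep

GRID = [0.1 * i for i in range(151)]  # hypothetical signal grid on [0, 15]

if __name__ == "__main__":
    # Hypothetical setting: true signal 2, background 3.
    print(mc_coverage(2.0, 3.0, GRID))
```

Comparing the returned fraction with the nominal level at several true signal values would show whether the coverage claim holds for this particular prior and grid; a different prior or error-control rule would need its own run.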
Original abstract
Probability theory provides a clear definition of what is meant by evidence in favor, against or none either way, of an event occurring for an unobserved response, via the principle of evidence. This is immediately applicable when carrying out a proper Bayesian analysis. Even without a prior, this imposes restrictions on reported inferences as these need to reflect the likelihood ordering. Relative belief inferences satisfy this requirement and, when the errors in these inferences are controlled, they also satisfy repeated sampling, or frequentist, requirements such as achieving given confidence levels. Relative belief inferences are considered here for the construction of intervals for uncertainty quantification in the context of a Poisson model for a signal with background noise. These intervals are contrasted with the well-known Feldman-Cousins intervals for this problem.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the principle of evidence from probability theory restricts inferences to respect likelihood ordering, that relative belief (RB) inferences satisfy this principle, and that when errors in RB inferences are controlled they also achieve repeated-sampling properties such as exact or conservative frequentist coverage. It develops RB interval constructions for a Poisson signal-plus-background model and contrasts them with Feldman-Cousins intervals.
Significance. If a general, assumption-light error-control procedure for RB intervals can be shown to deliver the stated frequentist coverage in the Poisson model, the work would supply a coherent bridge between evidence-based ordering and frequentist guarantees, which is of direct interest for uncertainty quantification in particle-physics searches where Feldman-Cousins is the current standard.
major comments (2)
- [Abstract, §3] Abstract and §3 (Poisson application): the central claim that 'when the errors in these inferences are controlled' the RB intervals achieve given confidence levels is load-bearing, yet the manuscript provides only illustrative numerical comparisons with Feldman-Cousins and does not derive or demonstrate a general error-control rule that produces exact or conservative coverage for all signal strengths, background rates, and observation regimes without further model-specific tuning.
- [§3] §3, discussion of coverage: the paper asserts that RB intervals can be made to satisfy repeated-sampling requirements, but no table or figure reports empirical coverage probabilities over a grid of true signal values; without such verification the frequentist guarantee does not follow from the principle of evidence alone.
minor comments (2)
- Notation for the relative-belief ratio and the error-control threshold should be introduced once with a single symbol and used consistently thereafter.
- The manuscript would benefit from an explicit statement of the precise frequentist coverage target (e.g., exact 95 % or conservative) that the error-control procedure is intended to achieve.
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major comment below, with revisions where the manuscript requires clarification or additional support.
Point-by-point responses
- Referee: [Abstract, §3] Abstract and §3 (Poisson application): the central claim that 'when the errors in these inferences are controlled' the RB intervals achieve given confidence levels is load-bearing, yet the manuscript provides only illustrative numerical comparisons with Feldman-Cousins and does not derive or demonstrate a general error-control rule that produces exact or conservative coverage for all signal strengths, background rates, and observation regimes without further model-specific tuning.
  Authors: The manuscript focuses on the Poisson signal-plus-background model and does not claim or derive a general error-control rule that applies without model-specific tuning across arbitrary regimes. The claim is that relative belief inferences satisfy the principle of evidence and, when errors are controlled in this setting, the resulting intervals achieve the stated frequentist properties, as shown through the direct numerical comparisons with Feldman-Cousins. We will revise the abstract and §3 to make the model-specific scope of the error control and frequentist results explicit.
  Revision: partial
- Referee: [§3] §3, discussion of coverage: the paper asserts that RB intervals can be made to satisfy repeated-sampling requirements, but no table or figure reports empirical coverage probabilities over a grid of true signal values; without such verification the frequentist guarantee does not follow from the principle of evidence alone.
  Authors: The referee correctly notes that the manuscript contains no table or figure reporting empirical coverage probabilities over a grid of true signal values. The frequentist properties are illustrated via targeted comparisons rather than a systematic coverage study. We will add a figure or table with coverage results over a range of signal strengths in the revised manuscript.
  Revision: yes
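Because the data are a discrete count, a coverage table of the kind requested can also be computed by direct summation rather than simulation. The sketch below is one possible way to do this under stated assumptions (single count, known background, uniform prior on a hypothetical signal grid, illustrative function names); it is not the procedure in the manuscript.

```python
import math

def poisson_pmf(x, mean):
    """Poisson probability of count x; requires mean > 0."""
    return math.exp(x * math.log(mean) - mean - math.lgamma(x + 1))

def rb_ratio(lam0, x, b, grid):
    """Relative belief ratio at lam0 under a uniform prior on `grid`:
    the likelihood at lam0 divided by the average likelihood over the grid."""
    m = sum(poisson_pmf(x, b + lam) for lam in grid) / len(grid)
    return poisson_pmf(x, b + lam0) / m

def exact_coverage(lam0, b, grid, x_max=80):
    """Probability, when lam0 is true, that the count gives RB(lam0 | x) > 1."""
    return sum(
        poisson_pmf(x, b + lam0)
        for x in range(x_max + 1)
        if rb_ratio(lam0, x, b, grid) > 1.0
    )

GRID = [0.1 * i for i in range(151)]  # hypothetical signal grid on [0, 15]
B = 3.0                               # hypothetical known background mean

if __name__ == "__main__":
    print("true signal   coverage of plausible region")
    for lam0 in (0.0, 1.0, 2.0, 5.0, 10.0):
        print(f"{lam0:11.1f}   {exact_coverage(lam0, B, GRID):.4f}")
```

The sum is truncated at `x_max`, which is harmless here because counts above 80 have negligible probability for the means considered; larger means would need a larger cutoff.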
Circularity Check
No circularity: relative belief properties and error-controlled frequentist coverage are presented as independent consequences, with no self-referential definitions or load-bearing self-citations in the provided text.
Full rationale
The abstract states that relative belief inferences satisfy the principle of evidence by construction from likelihood ordering and, separately, that error control allows them to meet frequentist confidence levels in the Poisson model. No equations, fitted parameters, or self-citations are exhibited that would reduce the coverage claim to a definition or prior result by the same authors. The derivation chain is not shown to collapse by construction; the error-control step is described as an additional requirement rather than tautological. This is the expected honest non-finding when no specific reduction (e.g., Eq. X defined via the coverage it claims to achieve) can be quoted.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: The principle of evidence provides a clear definition of evidence in favor of, against, or neutral toward an event.
- domain assumption: Errors in relative belief inferences can be controlled to achieve given confidence levels.
Reference graph
Works this paper leans on
-
[1]
This is immediately applicable when carrying out a proper Bayesian analysis
Confidence, Statistical Evidence and Relative Belief with Applications to a Problem in Particle Physics. Michael Evans and Siqi Zheng, Department of Statistical Sciences, University of Toronto. Abstract: Probability theory provides a clear definition of what is meant by evidence in favor, against or none either way, of an event occurring for an unobserved r...
Pith/arXiv arXiv 1930
-
[2]
Certainly, the consideration of such error is essential as it is a measure of the reliability of the inference being quoted
This criticism is not surprising because the concept of confidence itself was not designed to reflect evidence, but rather confidence regions are used to measure the error in an estimate lying within the region. Certainly, the consideration of such error is essential as it is a measure of the reliability of the inference being quoted. As will be shown in ...
1986
-
[3]
because the observed data is more probable when $\psi_1$ is the true value than when $\psi_2$ is the true value. The likelihood ordering seems very natural and Theorem 1 in Section 2.2 implies that any region quoted as a candidate to contain the true value must respect the ordering and so be a likelihood region. To use the likelihood as a basis for inference and the...
1998
-
[4]
Such an outcome is commonly considered as an absurdity and a region that exhibits such behavior is called improper or absurd
Notice that $C(1) = C(2) = \Theta$, the whole parameter space. Such an outcome is commonly considered as an absurdity and a region that exhibits such behavior is called improper or absurd. The reason for this is that, if $C(1)$ or $C(2)$ is stated, then it is categorically known that the true value is in the set and the confidence level 11/12 seems irrelevant. It has been ...
2024
-
[5]
One is that it is silent about which interval to quote. It is reasonable to answer, however, that this is not a problem when a prior on $\psi$ is provided, as with relative belief, and so this is a problem that other approaches to inference have to deal with. A more serious concern is how to obtain the inference base $I_\Psi$ for a marginal parameter? This is not a p...
2006
-
[6]
If (7) is large, this indicates that not finding evidence in favor of $H_0$, based on the observed data will happen with high prior probability, when $H_0$ is true
denotes the conditional prior distribution of the data given that $H_0$ is true, namely, the nuisance parameters have been integrated out. If (7) is large, this indicates that not finding evidence in favor of $H_0$, based on the observed data will happen with high prior probability, when $H_0$ is true. As such, it cannot be claimed that finding evidence against is ...
2015
-
[7]
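For the estimation problems the biases are obtained by averaging these biases over $\lambda_0$ with respect to the prior (now placed on $\lambda_0$)
$$= 1 - \sum_{\{t \,:\, RB(\lambda_0 \mid t) > 1\}} \frac{n^t (b + \lambda_0)^t}{t!} e^{-n(b + \lambda_0)},$$
$$\text{bias in favor}(\lambda_0) = \sup_{\lambda \,:\, |\lambda_0 - \lambda| \ge \delta/2} M_T\big(RB_\Psi(\lambda_0 \mid t) \ge 1 \mid \lambda\big) = \sup_{\lambda \in \{\lambda_0 - \delta/2,\ \lambda_0 + \delta/2\}} \sum_{\{t \,:\, RB(\lambda_0 \mid t) \ge 1\}} \frac{n^t (b + \lambda)^t}{t!} e^{-n(b + \lambda)},$$
where $\delta$ is the difference that matters. For the estimation problems the biases are obtained by averaging these biases over $\lambda_0$ with respect to the prior (now placed on $\lambda_0$). Consider now impl...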
2025
-
[8]
With the background so much greater than the signal, it is hard to discern the signal, at least with such a small sample size
The strength of this evidence is $Str(0 \mid x) = 0.18$, so there is moderate evidence in favor, but certainly not worth claiming that $H_0$ is true. With the background so much greater than the signal, it is hard to discern the signal, at least with such a small sample size. The resulting plausible interval is (0, 1.87), and it contains 92.3% of the posterior probab...
1999
-
[9]
Left panel: prior on $\lambda$, prior on $b$, posterior density, and plausible interval for $\lambda$. Right panel: relative belief ratio for $\lambda$, where the horizontal dashed line at 1 marks the evidence cutoff and plausible interval. Section 4, Code Availability: The methods described in this paper are implemented in the Python package rbinfer, freely available at https://github.com/siqi-z...
-
[10]
Teo, Y.S., Jeong, H., Prasannan, N., Brecht, B., Silberhorn, C., Evans, M., Mogilevtsev, D
DOI: 10.1103/PhysRevA.110.012231. Teo, Y.S., Jeong, H., Prasannan, N., Brecht, B., Silberhorn, C., Evans, M., Mogilevtsev, D. and Sanchez-Soto, L.L. (2024b) Evidence-based certification of quantum dimensions. Physical Review Letters 133, 050204, DOI: 10.1103/PhysRevLett.133.050204. [Table fragment: Belief = 0.75; columns UL init, UL 1, Cont 1, UL 2, Cont 2, $RB_\Lambda(0 \mid x)$, $Str_\Lambda(0 \mid x)$; 5.0 ...]