Strong Likelihood Principle: Strengthening a Principle or Misunderstanding the Likelihood Function
Pith reviewed 2026-06-27 14:21 UTC · model grok-4.3
The pith
The strong likelihood principle collapses to the weak one once the likelihood function is defined on the family of distributions M rather than a parameter space.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
When the likelihood function is defined on the family of distributions M rather than on a parameter space, the strong likelihood principle collapses into the weak likelihood principle. The paper illustrates this by analogy with monetary value and develops the claim through the binomial and negative binomial families sharing a parameter, connecting the result to the geometric structure of M via the Fisher information metric. The same standardization arises from a statistical argument about comparing measurements across populations and from a geometric argument about manifold distance, supplying the positive content of the weak likelihood principle.
What carries the argument
The argument rests on taking the family of distributions M as the domain of the likelihood function; this choice forces the strong likelihood principle to reduce to the weak likelihood principle.
If this is right
- The strong likelihood principle supplies no additional inferential content beyond the weak likelihood principle.
- Likelihood comparisons across sampling models that share a parameter become standardized by the geometry of the family M.
- The weak likelihood principle acquires positive content from both statistical measurement arguments and manifold-distance arguments.
- Birnbaum's derivation of the strong principle from sufficiency and conditionality reflects a confusion about the likelihood's domain.
Where Pith is reading between the lines
- The geometric standardization on M could be tested by checking whether likelihood-based inferences remain invariant under reparameterizations that preserve the family structure.
- This domain clarification may apply to other likelihood-based principles such as those involving profile likelihoods or marginal likelihoods.
- The convergence of statistical and geometric arguments suggests examining whether Fisher information distances directly quantify the standardization needed for cross-population measurements.
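The last point can be probed numerically. The Python sketch below is a minimal illustration under assumed values (n = 12 fixed trials for the binomial model, r = 3 required successes for the negative binomial model; the function names are hypothetical and not from the paper). It computes the Fisher information of each model as the expected squared score and prints it next to the standard closed forms n/(theta(1-theta)) and r/(theta^2(1-theta)). It does not reproduce the paper's standardization; it only shows that the two families carry different Fisher metrics at the same parameter value.

```python
from math import comb


def binomial_fisher_information(theta, n):
    """E[score^2] for X ~ Binomial(n, theta), summed over x = 0..n."""
    total = 0.0
    for x in range(n + 1):
        pmf = comb(n, x) * theta**x * (1 - theta) ** (n - x)
        score = x / theta - (n - x) / (1 - theta)
        total += pmf * score**2
    return total


def negative_binomial_fisher_information(theta, r, max_trials=5000):
    """E[score^2] for N = number of trials until the r-th success.

    The infinite sum over N is truncated at max_trials (an assumption
    that is harmless when theta is not very close to 0).
    """
    total = 0.0
    for n in range(r, max_trials + 1):
        pmf = comb(n - 1, r - 1) * theta**r * (1 - theta) ** (n - r)
        score = r / theta - (n - r) / (1 - theta)
        total += pmf * score**2
    return total


n_trials, r_successes = 12, 3  # illustrative values, not taken from the paper
for theta in (0.25, 0.5):
    info_bin = binomial_fisher_information(theta, n_trials)
    info_nb = negative_binomial_fisher_information(theta, r_successes)
    closed_bin = n_trials / (theta * (1 - theta))
    closed_nb = r_successes / (theta**2 * (1 - theta))
    print(f"theta={theta}: binomial {info_bin:.4f} (closed form {closed_bin:.4f}), "
          f"negative binomial {info_nb:.4f} (closed form {closed_nb:.4f})")
```

With these values the closed forms give 48 and 24 at theta = 0.5, and the two informations coincide only where theta = r/n (here 0.25, where both equal 64).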
Load-bearing premise
The likelihood function is naturally defined as a function on a family of distributions M rather than on a parameter space.
What would settle it
An explicit pair of experiments, one binomial and one negative binomial, sharing the same parameter value, where the likelihood ratio computed on M fails to match the standardized comparison required by the weak principle.
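As a concrete instance of such a pair, the following Python sketch evaluates the conventional parameter-space likelihoods for a binomial experiment (12 trials fixed in advance, 3 successes observed) and a negative binomial experiment (sampling until the 3rd success, which arrives on trial 12). The data values and function names are illustrative assumptions, not taken from the paper. The sketch shows only the textbook fact that the two likelihoods are proportional in theta; it does not implement the paper's likelihood on M or the standardized comparison.

```python
from math import comb


def binomial_likelihood(theta, successes, trials):
    """P(X = successes) when the number of trials is fixed in advance."""
    failures = trials - successes
    return comb(trials, successes) * theta**successes * (1 - theta) ** failures


def negative_binomial_likelihood(theta, successes, trials):
    """P(N = trials) when sampling stops at the `successes`-th success."""
    failures = trials - successes
    return comb(trials - 1, successes - 1) * theta**successes * (1 - theta) ** failures


successes, trials = 3, 12  # illustrative data, not taken from the paper
for theta in (0.1, 0.25, 0.5, 0.9):
    ratio = (binomial_likelihood(theta, successes, trials)
             / negative_binomial_likelihood(theta, successes, trials))
    print(f"theta={theta:.2f}  binomial / negative binomial = {ratio:.6f}")
```

Under these assumptions the ratio is C(12,3)/C(11,2) = 220/55 = 4 at every theta, up to floating-point rounding. This constant ratio is the proportionality that the strong principle appeals to and that the paper argues is ill-posed once the likelihood is read as a function on M.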
Original abstract
The strong likelihood principle (SLP) is conventionally derived from the sufficiency principle and a conditionality principle in an argument due to Birnbaum, and much of the literature contests whether the derivation is sound. We take a different approach. We ask what the SLP says when its terms are read carefully, and argue that the principle as ordinarily stated reflects a confusion about the domain of the likelihood function. The likelihood is naturally defined as a function on a family of distributions $M$, not on a parameter space, and once it is so defined the SLP collapses into its weak counterpart, the weak likelihood principle. The diagnosis is illustrated by analogy with monetary value, developed concretely through a comparison of the binomial and negative binomial families that share a parameter, and connected to the geometric structure of $M$ through the Fisher information metric. The same standardization emerges from a statistical argument about comparing measurements across populations and from a geometric argument about manifold distance; this convergence supplies the positive content of the weak likelihood principle.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that the strong likelihood principle (SLP), conventionally derived from sufficiency and conditionality, reflects a confusion about the domain of the likelihood function. The likelihood is naturally a function on the family of distributions M rather than a parameter space; once redefined on M, the SLP collapses into the weak likelihood principle (WLP). This is illustrated via an analogy to monetary value, a concrete binomial versus negative-binomial comparison sharing a parameter, and connections to the Fisher information metric on M. Convergence of statistical arguments about cross-population measurement and geometric manifold-distance arguments supplies positive content for the WLP.
Significance. If the central claim holds, the paper would reframe decades of debate on Birnbaum's derivation as a domain-specification issue rather than a question of logical validity, potentially redirecting foundational statistics toward the geometric structure of likelihood. The explicit convergence of definitional, statistical, and geometric lines of argument is a strength that supplies independent motivation for the WLP. The result would be significant for the literature on likelihood principles if accompanied by a general demonstration that the domain shift preserves the intended content of SLP statements.
major comments (3)
- [Abstract] Abstract and opening sections: the claim that 'once it is so defined the SLP collapses into its weak counterpart' is not supported by an explicit logical mapping or derivation showing that every standard SLP statement becomes equivalent to the corresponding WLP statement under the domain M; the reduction therefore risks being definitional by construction rather than exposing an internal error in the original formulation.
- [Binomial/negative binomial comparison] Binomial/negative-binomial comparison (the concrete illustration section): while the example shows that the same numerical parameter can label distinct elements of M, it does not establish that the SLP-to-WLP collapse holds for arbitrary sampling models or that the original SLP statements are thereby rendered identical to WLP statements; a general argument is required for the central claim.
- [Geometric argument] Geometric argument via Fisher metric: the connection to manifold distance supplies motivation for standardization on M, but does not by itself demonstrate the logical collapse of the SLP; the paper must show how this geometric fact entails equivalence of the two principles rather than merely consistency with the WLP.
minor comments (2)
- The monetary-value analogy is suggestive but would benefit from a short numerical table contrasting 'value' under different units to parallel the binomial/negative-binomial case.
- Notation for the family $M$ and the likelihood map $L: M \to \mathbb{R}$ should be introduced with a formal definition early in the text to avoid ambiguity when contrasting with the conventional $L(\theta; x)$.
Simulated Author's Rebuttal
We thank the referee for the thoughtful and detailed report. The comments correctly identify places where the logical steps from domain redefinition to the collapse of the SLP can be made more explicit. We will revise the manuscript to supply the requested mappings and general arguments while preserving the paper's core thesis that the domain of the likelihood is M.
Point-by-point responses
-
Referee: [Abstract] Abstract and opening sections: the claim that 'once it is so defined the SLP collapses into its weak counterpart' is not supported by an explicit logical mapping or derivation showing that every standard SLP statement becomes equivalent to the corresponding WLP statement under the domain M; the reduction therefore risks being definitional by construction rather than exposing an internal error in the original formulation.
Authors: We accept that an explicit logical mapping would strengthen the presentation and prevent any appearance of definitional collapse. In the revised version we will insert a new subsection that takes standard formulations of the SLP (including Birnbaum's sufficiency-plus-conditionality derivation) and shows, statement by statement, how each becomes a WLP statement once the likelihood is treated as a function on M rather than on a shared parameter space. The mapping rests on the observation that distinct sampling models correspond to distinct points of M, so cross-model likelihood ratios are undefined; this is an internal consequence of the domain choice rather than an external stipulation.
revision: yes
-
Referee: [Binomial/negative binomial comparison] Binomial/negative-binomial comparison (the concrete illustration section): while the example shows that the same numerical parameter can label distinct elements of M, it does not establish that the SLP-to-WLP collapse holds for arbitrary sampling models or that the original SLP statements are thereby rendered identical to WLP statements; a general argument is required for the central claim.
Authors: The binomial-negative-binomial comparison is offered only as a concrete illustration of how a shared numerical parameter can index distinct elements of M. The general argument is the domain redefinition itself, which applies to any pair of sampling models. To address the request for an explicit general demonstration, the revision will add a short section that considers arbitrary distinct models M1 and M2 sharing a parameter label and shows that any SLP claim comparing L_M1 and L_M2 is ill-formed, reducing directly to separate WLP claims within each M. The concrete example will be retained as motivation.
revision: yes
-
Referee: [Geometric argument] Geometric argument via Fisher metric: the connection to manifold distance supplies motivation for standardization on M, but does not by itself demonstrate the logical collapse of the SLP; the paper must show how this geometric fact entails equivalence of the two principles rather than merely consistency with the WLP.
Authors: We agree that the Fisher-metric argument is not presented as a standalone proof of the collapse; it is one of three independent lines (definitional, statistical, geometric) that converge on standardization to M. The logical collapse is derived from the domain redefinition. In revision we will add an explicit clarifying sentence stating that the geometric construction supplies supporting structure and independent motivation for the WLP without being claimed to entail the equivalence by itself.
revision: yes
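The domain point made in the first two responses can be sketched in code. The Python fragment below is a hypothetical illustration, not the authors' construction: it represents a point of M as a distribution object tagged with its family, and it defines a likelihood ratio only for two points of the same family. The class and function names, and the choice to raise an error for cross-family ratios, are assumptions made here to mirror the authors' statement that such ratios are undefined.

```python
from dataclasses import dataclass
from math import comb
from typing import Callable


@dataclass(frozen=True)
class Distribution:
    """A point of a family M: a pmf tagged with the family it belongs to."""
    family: str
    theta: float
    pmf: Callable[[int], float]


def binomial(theta: float, n: int) -> Distribution:
    # Observation is the number of successes in n fixed trials.
    return Distribution(f"Binomial(n={n})", theta,
                        lambda x: comb(n, x) * theta**x * (1 - theta) ** (n - x))


def negative_binomial(theta: float, r: int) -> Distribution:
    # Observation is the number of trials needed to reach the r-th success.
    return Distribution(f"NegBinomial(r={r})", theta,
                        lambda n: comb(n - 1, r - 1) * theta**r * (1 - theta) ** (n - r))


def likelihood_ratio(p: Distribution, q: Distribution, observation: int) -> float:
    """Likelihood ratio of two points of the same family at one observation."""
    if p.family != q.family:
        # Assumption of this sketch: cross-family ratios are treated as undefined.
        raise ValueError("points lie in different families; ratio not defined here")
    return p.pmf(observation) / q.pmf(observation)


# Within one family the comparison is well defined.
print(likelihood_ratio(binomial(0.25, 12), binomial(0.5, 12), observation=3))

# Across families the same parameter label does not license a comparison.
try:
    likelihood_ratio(binomial(0.25, 12), negative_binomial(0.25, 3), observation=3)
except ValueError as err:
    print("refused:", err)
```

Tagging each distribution with its family is one simple way to make the domain explicit; whether the paper formalizes M this way is not stated in the material above.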
Circularity Check
Central claim of SLP collapse follows directly from redefinition of likelihood domain
specific steps
-
self-definitional
[Abstract]
"The likelihood is naturally defined as a function on a family of distributions $M$, not on a parameter space, and once it is so defined the SLP collapses into its weak counterpart, the weak likelihood principle."
The argument asserts that the SLP as ordinarily stated reflects confusion about the domain, and that redefining likelihood as a map on M makes SLP identical to WLP. The collapse is presented as following immediately from the domain change, rendering the reduction definitional by construction rather than a derived equivalence independent of the redefinition.
full rationale
The paper's main thesis is that careful reading of the SLP reveals a domain confusion, leading to its collapse into the WLP upon correcting the domain to the family of distributions M. This is illustrated with examples and connected to geometric structure. While supporting arguments from statistics and geometry provide independent content for the WLP, the specific claim that SLP collapses is tied to the definitional shift, creating moderate circularity in the diagnosis of the principle as a misunderstanding.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The likelihood function is naturally defined on the family of distributions M rather than on a parameter space.
Reference graph
Works this paper leans on
- [1] Barnard, G. A. and Sprott, D. A. (2006). Likelihood. In Encyclopedia of Statistical Sciences. John Wiley & Sons, New York. https://doi.org/10.1002/0471667196.ess1448.pub2
- [2] Berger, J. O. and Wolpert, R. L. (1988). The Likelihood Principle, 2nd ed. Lecture Notes-Monograph Series 6. IMS, Hayward, CA. https://doi.org/10.1214/lnms/1215466210
- [3] Birnbaum, A. (1962). On the foundations of statistical inference. J. Amer. Statist. Assoc. 57 269–306. https://doi.org/10.1080/01621459.1962.10480660
- [4] Cox, D. R. (1958). Some problems connected with statistical inference. Ann. Math. Statist. 29 357–372. https://doi.org/10.1214/aoms/1177706618
- [5] Cox, D. R. and Hinkley, D. V. (1974). Theoretical Statistics. Chapman & Hall, London.
- [6] Dawid, A. P. (2014). Discussion of "On the Birnbaum argument for the strong likelihood principle." Statist. Sci. 29 240–241. https://doi.org/10.1214/14-STS470
- [7] Durbin, J. (1970). On Birnbaum's theorem on the relation between sufficiency, conditionality and likelihood. J. Amer. Statist. Assoc. 65 395–398. https://doi.org/10.1080/01621459.1970.10481088
- [8] Evans, M. J., Fraser, D. A. S. and Monette, G. (1986). On principles and arguments to likelihood. Canad. J. Statist. 14 181–199. https://doi.org/10.2307/3314794
- [9] Fraser, D. A. S. (1963). On the sufficiency and likelihood principles. J. Amer. Statist. Assoc. 58 641–647.
- [10] Kalbfleisch, J. D. (1975). Sufficiency and conditionality. Biometrika 62 251–268. https://doi.org/10.1093/biomet/62.2.251
- [11] Mayo, D. G. (2014). On the Birnbaum argument for the strong likelihood principle. Statist. Sci. 29 227–239. https://doi.org/10.1214/13-STS457
- [12] Vos, P. W. (2022). Generalized estimators, slope, efficiency, and Fisher information bounds. Information Geometry 7 151–170. https://doi.org/10.1007/s41884-022-00085-7
- [13] Vos, P. W. and Wu, Q. (2025). Generalized estimation and information. Information Geometry 8 99–123. https://doi.org/10.1007/s41884-025-00164-5