Bayes factors with (overly) informative priors

Richard A Lockhart

arxiv: 1907.02473 · v2 · pith:RF4W47CNnew · submitted 2019-07-04 · 🧮 math.ST · stat.ME · stat.TH

Pith reviewed 2026-05-25 08:38 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH

keywords Bayes factors · informative priors · independent parameters · asymptotic analysis · model selection · Bayesian inference · prior specification

The pith

Priors assuming independence among many parameters make Bayes factors insensitive to data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that priors treating a large number of parameters as independent prevent effective updating from data when using Bayes factors for model comparison. It draws on examples from published work and derives large-sample limits to demonstrate that the Bayes factor fails to reflect evidence in the data as the number of such parameters grows. A sympathetic reader would care because this undermines the reliability of Bayesian model selection in settings where independence assumptions are common. The argument rests on asymptotic analysis rather than finite-sample simulation.

Core claim

Priors in which a large number of parameters are specified to be independent are dangerous; they make it hard to learn from data. This is demonstrated by examples drawn from the literature together with large-sample theory that tracks the behavior of the Bayes factor under these priors.
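
One way to see why independence across many parameters can act as strong prior information is to look at what such a prior implies about a summary of the parameters. The sketch below is a toy illustration, not an example taken from the paper: it assumes parameters theta_i that are iid N(0, tau2) a priori and reports the prior standard deviation of their mean square, both exactly and by simulation. The function names, tau2 = 1, and the values of p are assumptions chosen for illustration.

```python
import math
import random

def prior_sd_of_mean_square(p, tau2=1.0):
    """Exact prior sd of (1/p) * sum(theta_i^2) for theta_i iid N(0, tau2).

    Var(theta_i^2) = 2 * tau2^2, so the sd of the average is tau2 * sqrt(2 / p).
    """
    return tau2 * math.sqrt(2.0 / p)

def simulated_sd_of_mean_square(p, tau2=1.0, draws=1000, seed=0):
    """Monte Carlo check of the same quantity (hypothetical helper)."""
    rng = random.Random(seed)
    sd = math.sqrt(tau2)
    vals = [sum(rng.gauss(0.0, sd) ** 2 for _ in range(p)) / p
            for _ in range(draws)]
    mean = sum(vals) / draws
    return math.sqrt(sum((v - mean) ** 2 for v in vals) / (draws - 1))

if __name__ == "__main__":
    for p in (1, 10, 100, 1000):
        print(p, prior_sd_of_mean_square(p), simulated_sd_of_mean_square(p))
```

Under these assumptions the prior spread of the mean square shrinks like 1/sqrt(p), so a prior that looks vague for each parameter separately is sharply concentrated on the overall signal size when p is large.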

What carries the argument

Large-sample asymptotic analysis of the Bayes factor, which tracks how the factor behaves when parameters are modeled as independent.

If this is right

Bayes factors may remain near 1 even when data strongly favor one model over another.
Increasing the number of independently specified parameters reduces the effective sample size for model comparison.
Model selection conclusions can become independent of the observed data in the large-sample limit.
Standard default priors in high-dimensional problems require explicit dependence structure to remain useful for Bayes factor calculations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The result implies that default independent priors should be avoided in any high-dimensional Bayesian model selection task.
One could test the claim by replacing independent priors with a joint prior that induces dependence and checking whether the Bayes factor then tracks the data.
The same mechanism may affect posterior predictive checks or marginal likelihood estimates in related Bayesian workflows.
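
The test suggested above can be sketched in a toy normal-means setting. This is an assumed setup, not the paper's: y_i given theta_i is N(theta_i, 1), the null model M0 sets every theta_i to 0, and the alternative either fixes theta_i iid N(0, tau2) or places a uniform hyperprior on a grid of tau2 values, which makes the theta_i dependent a priori. The grid, p = 5000, and the two data scenarios are illustrative assumptions, and the data enter only through the sum of squares, set here to its expected value rather than simulated.

```python
import math

def log_bf_fixed(sum_sq, p, tau2):
    """log Bayes factor of M1 against M0 for y_i | theta_i ~ N(theta_i, 1).

    M0: every theta_i = 0.  M1: theta_i iid N(0, tau2) with tau2 fixed,
    so marginally y_i ~ N(0, 1 + tau2). The data enter only via sum_sq.
    """
    return -0.5 * p * math.log1p(tau2) + 0.5 * sum_sq * tau2 / (1.0 + tau2)

def log_bf_hier(sum_sq, p, tau2_grid):
    """Same comparison with tau2 uniform on a grid (assumed hyperprior).

    The Bayes factor is the average of the fixed-tau2 Bayes factors,
    computed with a log-sum-exp for numerical stability.
    """
    terms = [log_bf_fixed(sum_sq, p, t) for t in tau2_grid]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms)) - math.log(len(terms))

P = 5000                                         # number of parameters (assumed)
TAU2_GRID = [0.05 * k for k in range(1, 41)]     # 0.05, 0.10, ..., 2.00 (assumed)

if __name__ == "__main__":
    for signal_var in (0.0, 0.2):                # true variance of the theta_i
        sum_sq = P * (1.0 + signal_var)          # expected value of sum(y_i^2)
        print(signal_var,
              log_bf_fixed(sum_sq, P, 1.0),
              log_bf_hier(sum_sq, P, TAU2_GRID))
```

With these assumed numbers, the fixed independent prior with tau2 = 1 gives a negative log Bayes factor in both scenarios, while the dependent prior gives a negative value when there is no signal and a positive one when the signal variance is 0.2. This shows only the direction of the effect in one toy model; it does not verify the paper's asymptotic results.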

Load-bearing premise

The large-sample asymptotic analysis of the Bayes factor under the stated priors accurately reflects finite-sample behavior and the cited literature examples are representative of typical modeling practice.

What would settle it

A finite-sample simulation or real-data example in which the Bayes factor under independent priors responds strongly to data, contrary to the asymptotic prediction that its value stabilizes away from the data-driven value.

Original abstract

Priors in which a large number of parameters are specified to be independent are dangerous; they make it hard to learn from data. I present a couple of examples from the literature and work through a bit of large sample theory to show what happens.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Independent priors on many parameters can make Bayes factors ignore the data, shown via examples and asymptotics, but finite-sample checks are missing.

The main point is that priors treating a large number of parameters as independent can make Bayes factors stop responding to data. Lockhart illustrates this with a couple of literature examples and some large-sample theory to show the marginal likelihoods behave in a way that swamps the likelihood contribution. This is a practical warning worth noting for anyone doing model selection with Bayes factors in higher-dimensional settings.

The paper does a reasonable job of assembling the cases and walking through the asymptotics in plain terms, which makes the mechanism clear without overcomplicating things. It correctly flags a modeling habit that shows up often in variable selection or hierarchical setups.

The soft spot is the reliance on asymptotics alone. There is no finite-sample simulation or direct verification that the limiting behavior shows up at the sample sizes in the cited examples, and the regularity conditions for the theory are not spelled out. This leaves some uncertainty about how general the claim is in real applications. The observation itself draws from existing literature rather than introducing a fresh derivation, so the value is more in the focused reminder than in new ground.

This note is aimed at statisticians and machine learning practitioners who routinely use Bayes factors for model comparison. Someone already deep in prior sensitivity work will find it confirmatory, while others applying these tools might pick up a useful caution. The math is standard and the citations look appropriate.

I would send it for peer review. The issue affects common practice and the presentation is direct enough to merit referee input even with the asymptotic focus.

Referee Report

2 major / 1 minor

Summary. The manuscript argues that priors specifying a large number of parameters as independent are dangerous because they render Bayes factors insensitive to data, making it difficult to learn from observations. It supports the claim by referencing examples from the literature and developing large-sample asymptotic theory for the behavior of Bayes factors under such priors.

Significance. If the asymptotic analysis is shown to apply under the relevant regularity conditions and the cited examples are representative, the result would provide a useful cautionary note on prior specification for Bayes factor model comparison in high-dimensional settings. The paper correctly identifies a potential mechanism by which independence assumptions across many parameters can dominate the marginal likelihood.

major comments (2)

[large sample theory development (abstract and main text)] The central claim rests on large-sample asymptotics for the Bayes factor, but the manuscript provides no finite-sample analysis, simulations, or explicit verification that the asymptotic regime governs the behavior in the cited literature examples at typical sample sizes (as flagged in the stress-test note). This is load-bearing because the abstract itself notes the argument is illustrated via 'a bit of large sample theory' without bridging to finite n.
[large sample theory section] Regularity conditions required for the asymptotic results (e.g., on likelihood smoothness, prior tail behavior, and model identifiability) are not stated, preventing assessment of whether they hold for the literature examples referenced.

minor comments (1)

The title is somewhat informal; a more precise phrasing such as 'On the effect of independent priors on Bayes factors in high dimensions' would better reflect the content.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed comments. Our manuscript is a short note highlighting a cautionary mechanism rather than a comprehensive study; we respond to each major comment below.

Referee: [large sample theory development (abstract and main text)] The central claim rests on large-sample asymptotics for the Bayes factor, but the manuscript provides no finite-sample analysis, simulations, or explicit verification that the asymptotic regime governs the behavior in the cited literature examples at typical sample sizes (as flagged in the stress-test note). This is load-bearing because the abstract itself notes the argument is illustrated via 'a bit of large sample theory' without bridging to finite n.

Authors: The manuscript intentionally uses a light asymptotic argument plus literature examples to illustrate the mechanism, consistent with its short-note format and abstract wording. We agree that explicit finite-sample verification or simulations are absent and would strengthen the bridge to practice. In revision we will add a short paragraph discussing the sample sizes appearing in the cited examples and noting that the asymptotic divergence provides qualitative insight even for moderate n; this constitutes a partial revision without adding new simulations.

revision: partial

Referee: [large sample theory section] Regularity conditions required for the asymptotic results (e.g., on likelihood smoothness, prior tail behavior, and model identifiability) are not stated, preventing assessment of whether they hold for the literature examples referenced.

Authors: We accept the point. The asymptotics rely on standard Laplace-type approximations to the marginal likelihood. In the revised manuscript we will insert an explicit paragraph listing the regularity conditions (twice continuous differentiability of the log-likelihood, positive-definite Fisher information, and standard prior tail decay) under which the stated large-sample behavior holds, allowing readers to judge applicability to the examples.

revision: yes
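
For reference, the standard Laplace-type approximation mentioned in this response can be written as follows for a model with a fixed number d of parameters. The notation (L for the likelihood, \pi for the prior, \hat\theta for the maximum likelihood estimate, J for the observed information) is assumed here for illustration and is not taken from the paper.

```latex
% Laplace-type approximation to the marginal likelihood (fixed d, large n)
m(y) = \int L(\theta)\,\pi(\theta)\,d\theta
  \approx L(\hat\theta)\,\pi(\hat\theta)\,(2\pi)^{d/2}\,
          \bigl|J(\hat\theta)\bigr|^{-1/2},
\qquad
J(\hat\theta) = -\nabla^2 \log L(\hat\theta).

% With an independent prior, \pi(\theta) = \prod_{j=1}^{d} \pi_j(\theta_j), so
\log m(y) \approx \log L(\hat\theta)
  + \sum_{j=1}^{d} \log \pi_j(\hat\theta_j)
  + \frac{d}{2}\log(2\pi)
  - \frac{1}{2}\log\bigl|J(\hat\theta)\bigr|.
```

Under an independent prior the prior contribution is a sum of d terms, so it grows with the number of parameters. The approximation itself is stated for fixed d; whether it remains accurate when d is large relative to the sample size is exactly the kind of condition the referee asks the authors to spell out.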

Circularity Check

0 steps flagged

No circularity: claims rest on external literature examples and standard large-sample asymptotics

The paper presents examples from the literature and applies standard large-sample theory to illustrate the effect of independent priors on Bayes factors. No load-bearing step reduces by construction to a fitted parameter, self-definition, or a self-citation chain; the asymptotics are invoked as external results rather than derived from the paper's own inputs. The derivation is therefore self-contained against external benchmarks, with no evidence of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The claim rests on the applicability of large-sample theory to Bayes factors and on the representativeness of two literature examples; no free parameters or new entities are introduced.

axioms (1)

domain assumption: Standard large-sample asymptotic theory for Bayes factors applies under the stated prior constructions
The paper states it works through large-sample theory to show the effect.

pith-pipeline@v0.9.0 · 5545 in / 1064 out tokens · 32111 ms · 2026-05-25T08:38:27.272037+00:00 · methodology
