Progression to the mean: A comparison of Bayesian clinical prediction models outputting the posterior mean versus conventional plug-in predictions

Mohsen Sadatsafavi; Richard D. Riley

arxiv: 2605.19163 · v2 · pith:AZTJNBDKnew · submitted 2026-05-18 · 📊 stat.ME

Pith reviewed 2026-05-20 07:13 UTC · model grok-4.3

classification 📊 stat.ME

keywords clinical prediction models · Bayesian workflow · posterior mean · shrinkage priors · uncertainty quantification · logistic regression · clinical utility · simulation study

The pith

Posterior mean predictions from a pragmatic Bayesian workflow deliver higher clinical utility than plug-in estimates in most simulations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a practical Bayesian pipeline for clinical prediction models that replaces conventional point estimates with an individual's posterior mean risk. Shrinkage priors are paired with a Laplace or normal approximation to the posterior of the regression coefficients, avoiding the need for Monte Carlo sampling. Decisions are then based on this posterior mean, which is justified by an expected-utility argument. Simulations and examples show the approach matches or exceeds the predictive performance of standard methods while supplying uncertainty measures that achieve suitable coverage. In the majority of simulations the posterior-mean strategy produced higher clinical utility, sometimes by a substantial margin.

Core claim

The authors propose a Bayesian workflow for clinical prediction models that uses shrinkage priors to obtain posterior distributions of regression coefficients via a Laplace or normal approximation, then deploys an individual's posterior mean risk for decision-making on the basis of expected utility. Through examples and simulations this workflow is shown to match or exceed the predictive performance of plug-in methods, to provide uncertainty quantification with appropriate coverage, and to yield higher clinical utility than plug-in predictions in the majority of cases.
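The Laplace step in this workflow can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes independent zero-mean normal priors with a common standard deviation `prior_sd` on all coefficients (a stand-in for the paper's shrinkage priors), and the function names `laplace_logistic` and `linear_predictor_moments` are hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def laplace_logistic(X, y, prior_sd=1.0, n_iter=100, tol=1e-10):
    """Posterior mode and Laplace covariance for logistic regression.

    Assumption: independent N(0, prior_sd**2) priors on every coefficient,
    intercept included. The paper's priors may differ.
    """
    p = X.shape[1]
    prior_prec = np.eye(p) / prior_sd**2
    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        # Gradient and negative Hessian of the log posterior.
        grad = X.T @ (y - mu) - prior_prec @ beta
        neg_hess = X.T @ (X * (mu * (1.0 - mu))[:, None]) + prior_prec
        step = np.linalg.solve(neg_hess, grad)  # Newton step
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    # Curvature at the mode gives the approximate posterior covariance.
    mu = sigmoid(X @ beta)
    neg_hess = X.T @ (X * (mu * (1.0 - mu))[:, None]) + prior_prec
    return beta, np.linalg.inv(neg_hess)

def linear_predictor_moments(x_new, beta_hat, cov):
    # Approximate posterior mean and variance of x'beta for one individual.
    return float(x_new @ beta_hat), float(x_new @ cov @ x_new)
```

The two moments returned by `linear_predictor_moments` are what a posterior-mean calculation for an individual's risk would take as input, so no Monte Carlo draws are needed at this stage.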

What carries the argument

Posterior mean of individual risk, obtained from a Laplace- or normal-approximated Bayesian posterior and used as the decision quantity.

If this is right

Posterior-mean predictions often produce higher clinical utility than plug-in predictions.
Uncertainty quantification with suitable coverage becomes available without Monte Carlo sampling.
The posterior mean can be computed by quadrature or the MacKay approximation, or deployed as a simple logistic equation via projection-predictive mapping.
Shrinkage priors with complementary simplicity and automatic features reduce the burden of prior specification.
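Two of the posterior-mean computations named above are easy to compare side by side. The sketch below assumes the linear predictor for one individual has a normal posterior with mean `m` and variance `s2` (hypothetical names), and computes the expected logistic risk by Gauss-Hermite quadrature and by the MacKay approximation. It illustrates the general techniques, not the paper's specific code.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def posterior_mean_quadrature(m, s2, n_nodes=40):
    """E[sigmoid(eta)] for eta ~ N(m, s2) via Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables: eta = m + sqrt(2 * s2) * node.
    eta = m + np.sqrt(2.0 * s2) * nodes
    return float(np.sum(weights * sigmoid(eta)) / np.sqrt(np.pi))

def posterior_mean_mackay(m, s2):
    """MacKay approximation: shrink the linear predictor, then apply sigmoid."""
    return float(sigmoid(m / np.sqrt(1.0 + np.pi * s2 / 8.0)))

def plug_in(m):
    """Conventional plug-in risk: sigmoid at the point estimate."""
    return float(sigmoid(m))
```

For a linear predictor below zero (plug-in risk under 0.5), both posterior-mean versions return a risk above the plug-in value and closer to 0.5, and the gap widens as `s2` grows.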

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same posterior-mean logic could be tested in non-clinical prediction settings where decisions hinge on uncertain risks.
Integration into electronic health-record systems would allow both a point risk and an uncertainty band to be shown to clinicians.
Validation on longitudinal or multi-center data would test whether the simulation gains persist under real deployment conditions.

Load-bearing premise

The Laplace or normal approximation to the posterior, together with the chosen shrinkage priors, is accurate enough to preserve the benefits of full Bayesian inference for uncertainty quantification and posterior-mean computation in typical clinical settings.

What would settle it

A direct comparison on a large clinical dataset in which full MCMC posterior means produce materially different clinical utility from the Laplace-approximated means, or in which plug-in predictions outperform the posterior-mean strategy on net benefit.
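The net benefit comparison mentioned here could be set up as follows. This is a minimal sketch using the standard decision-curve definition of net benefit at a risk threshold; whether the paper uses exactly this utility metric is an assumption, and `net_benefit` is a hypothetical helper name.

```python
import numpy as np

def net_benefit(y, risk, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    y = np.asarray(y)
    risk = np.asarray(risk)
    treat = risk >= threshold
    n = y.size
    tp = np.sum(treat & (y == 1))  # treated, event occurred
    fp = np.sum(treat & (y == 0))  # treated, no event
    return tp / n - fp / n * threshold / (1.0 - threshold)
```

Evaluating this function on the same validation outcomes with plug-in risks, Laplace-approximated posterior means, and MCMC posterior means in turn would give the three numbers the comparison calls for.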

Figures

Figures reproduced from arXiv: 2605.19163 by Mohsen Sadatsafavi, Richard D. Riley.

Original abstract

Clinical prediction models provide predictions for individuals, typically expressed as point estimates derived from a deterministic function, such as a logistic regression equation. Such 'plug-in' predictions hide inherent uncertainty. In contrast, Bayesian methods offer a coherent mechanism for uncertainty propagation, and allow the computation of the posterior mean as the measure of centrality of choice for clinical decision-making. However, Bayesian methods are not widely utilised in predictive analytics for healthcare. We investigated the feasibility and performance of a Bayesian adaptation of the commonly used frequentist framework for risk prediction modelling. We assessed (i) the use of shrinkage priors with complementary features (simplicity, user input, and automatic shrinkage) that enable Laplace/normal approximation of the posterior, and (ii) exact and approximate methods for efficient computation of the posterior mean. Using examples and simulations, we demonstrate that this Bayesian approach is feasible and improves predictive performance, while enabling uncertainty quantification with suitable coverage. In small-to-medium sample sizes, the gain in clinical utility by using the posterior mean over plug-in predictions was equivalent to the gain from using a noticeably larger sample size. Adapting the widely used parametric regression methods to an approximate Bayesian framework for prediction modelling is both pragmatic and clinically advantageous.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a concrete pragmatic Bayesian workflow for clinical prediction models that skips MCMC via Laplace/normal approximations and deploys posterior means through projection-predictive mapping, with simulations suggesting better utility than plug-in predictions.

The key point is that this paper lays out a usable Bayesian pipeline for clinical risk models: shrinkage priors on coefficients, a Laplace or normal approximation to skip sampling, and then posterior-mean predictions justified by expected utility, with a projection-predictive step that turns the mean into a simple logistic equation for deployment. That combination is the fresh practical piece not already sitting in the clinical prediction literature they cite.

It does a solid job of lowering the barriers that keep Bayesian methods out of routine use (computational cost and complexity) while showing through examples and simulations that the posterior-mean approach often beats plug-in predictions on clinical utility, sometimes by a noticeable margin, and still gives reasonable coverage for uncertainty. The focus on keeping the final deployed form simple and familiar is a real strength for adoption in medical settings.

The soft spot is the approximation itself. The normal or Laplace approximation to the coefficient posterior can distort the mean of the logistic transform, particularly under shrinkage in the small-n settings common to clinical data, and the unquantified error on the predictive probability scale is worth checking. The abstract mentions exact and approximate methods for computing the posterior mean and claims the simulations support the utility gains, but if those simulations do not directly compare the approximate posterior means against MCMC or exact integration on the same data, the reported advantages rest on an untested assumption.

This is aimed at applied statisticians and clinical researchers who build and deploy prediction models and want a Bayesian option that stays practical. A reader looking for concrete steps rather than theory would find it useful. It deserves a serious referee because the workflow is internally consistent and the practical motivation is clear, even if the approximation validation needs tightening in revision.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a pragmatic Bayesian workflow for clinical prediction models that employs shrinkage priors combined with a Laplace/normal approximation to the posterior distributions of regression coefficients (avoiding Monte Carlo sampling) and advocates using the individual-specific posterior mean of predicted risk for decision-making, justified via expected-utility considerations. Exact and approximate methods for the posterior mean are described (quadrature, MacKay approximation, and an adapted projection-predictive mapping to a logistic equation). Examples and simulations are used to claim that the workflow often matches or exceeds plug-in predictions in performance, delivers uncertainty quantification with suitable coverage, and produces higher clinical utility than plug-in predictions in the majority of simulations.

Significance. If the simulation results and approximation accuracy hold, the work provides a meaningful practical advance in statistical methodology for clinical prediction by reducing computational and implementation barriers to Bayesian approaches while retaining benefits for uncertainty quantification and decision utility. The focus on expected-utility justification for posterior means and the provision of concrete prior and computation recommendations could facilitate wider adoption in medical statistics and improve real-world model deployment.

major comments (1)

[methods (Laplace/normal approximation)] Description of the Laplace/normal approximation (methods section): The central claim that posterior-mean predictions yield higher clinical utility than plug-in predictions in the majority of simulations depends on the Laplace/normal approximation (combined with the chosen shrinkage priors) sufficiently preserving the expected-utility advantage. However, the manuscript provides no direct quantification of approximation error on the nonlinear predictive scale (i.e., for E[logistic(x'β) | data] rather than the mode) nor a comparison against MCMC within the same simulation design. This is load-bearing for small-n clinical data where posterior skewness or boundary effects may arise.
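Why the nonlinear predictive scale matters can be seen from a second-order expansion. As a sketch, assume the linear predictor η = x'β has an approximately normal posterior with mean m and variance s², and write σ for the logistic function (symbols introduced here for illustration, not taken from the manuscript):

```latex
\begin{align*}
\mathbb{E}\left[\sigma(\eta) \mid \text{data}\right]
  &\approx \sigma(m) + \tfrac{1}{2}\, s^{2}\, \sigma''(m), \\
\sigma''(m) &= \sigma(m)\,\bigl(1-\sigma(m)\bigr)\,\bigl(1-2\sigma(m)\bigr).
\end{align*}
```

Under this expansion the correction is positive when σ(m) < 0.5 and negative when σ(m) > 0.5, so the posterior mean sits closer to 0.5 than the plug-in value σ(m), and the gap scales with s². Any error in the approximated posterior variance therefore feeds directly into the posterior-mean risk, which is where small-n data are most exposed.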

minor comments (2)

[abstract] The abstract states that 'examples and simulations support higher utility and suitable coverage' but does not report quantitative details such as simulation sample sizes, number of replicates, or specific utility metrics; adding these would strengthen the summary.
[methods] Notation for the posterior mean computation methods (e.g., the adaptation of projection-predictive mapping) could be clarified with an explicit equation or pseudocode to aid reproducibility.
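As an indication of the kind of pseudocode this comment asks for, one generic form of a projection step is sketched below. It assumes the projection is defined as the logistic equation that minimizes the Kullback-Leibler divergence from the posterior-mean risks on the development data, which reduces to a logistic fit with fractional targets. The manuscript's adapted mapping may differ, and `project_to_logistic` is a hypothetical name.

```python
import numpy as np

def sigmoid(z):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * z))

def project_to_logistic(X, p_bar, n_iter=100, tol=1e-10):
    """Coefficients of the logistic equation closest (in KL divergence)
    to the posterior-mean risks p_bar at the rows of X.

    Equivalent to logistic regression with fractional responses p_bar,
    solved here by Newton iterations.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        grad = X.T @ (p_bar - mu)
        hess = X.T @ (X * (mu * (1.0 - mu))[:, None])
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta
```

The returned coefficient vector defines a single deployable equation, `sigmoid(x @ beta)`, whose outputs approximate the posterior-mean risks rather than the plug-in risks.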

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive and detailed review of our manuscript. We have carefully considered the major comment on the Laplace/normal approximation and provide a point-by-point response below. Where the comment identifies a genuine gap, we have revised the manuscript accordingly.

Referee: Description of the Laplace/normal approximation (methods section): The central claim that posterior-mean predictions yield higher clinical utility than plug-in predictions in the majority of simulations depends on the Laplace/normal approximation (combined with the chosen shrinkage priors) sufficiently preserving the expected-utility advantage. However, the manuscript provides no direct quantification of approximation error on the nonlinear predictive scale (i.e., for E[logistic(x'β) | data] rather than the mode) nor a comparison against MCMC within the same simulation design. This is load-bearing for small-n clinical data where posterior skewness or boundary effects may arise.

Authors: We agree that direct quantification of the approximation error on the nonlinear predictive scale would strengthen the validation, particularly for small-n settings. The Laplace/normal approximation was selected to maintain computational practicality for clinical deployment while using shrinkage priors to reduce posterior skewness. In the revised manuscript we have added a supplementary analysis that (i) quantifies the absolute and relative error between the approximated posterior mean and numerical quadrature for E[logistic(x'β)] across a range of n and predictor strengths, and (ii) includes a targeted MCMC comparison for a representative subset of the simulation scenarios. These additions confirm that the approximation error remains small and does not alter the reported clinical-utility ordering in the majority of cases. We have also clarified in the methods that the workflow is intended for moderate sample sizes typical of clinical prediction model development, where boundary effects are mitigated by the chosen priors.

revision: yes

Circularity Check

0 steps flagged

No circularity: standard Bayesian expected-utility justification and independent simulation comparisons

The derivation chain relies on the standard decision-theoretic argument that basing the decision on the posterior mean of the predictive probability maximizes expected utility for a given loss function; this argument is independent of the specific Laplace/normal approximation chosen for computation. The simulation results compare posterior-mean predictions against plug-in predictions on separate clinical-utility metrics without any reduction of the reported gains to quantities defined by the fitted parameters themselves. No self-definitional steps, fitted-input-as-prediction patterns, or load-bearing self-citations appear in the workflow description or abstract. The approximation is explicitly presented as a pragmatic computational choice rather than a derived necessity, and the overall pipeline remains self-contained against external benchmarks.
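The decision-theoretic step can be written out for a binary treat / do-not-treat decision. The utilities u_TP, u_FP, u_FN, u_TN below are notation assumed here for illustration (with u_TP > u_FN and u_TN > u_FP), as is the symbol p for an individual's uncertain risk; none of it is taken from the paper.

```latex
\begin{align*}
\mathbb{E}\left[U(\text{treat}) \mid \text{data}\right]
  &= \bar{p}\, u_{TP} + (1-\bar{p})\, u_{FP},
  \qquad \bar{p} = \mathbb{E}\left[p \mid \text{data}\right], \\
\mathbb{E}\left[U(\text{no treat}) \mid \text{data}\right]
  &= \bar{p}\, u_{FN} + (1-\bar{p})\, u_{TN}, \\
\text{treat} \iff \bar{p}
  &> \frac{u_{TN}-u_{FP}}{(u_{TP}-u_{FN}) + (u_{TN}-u_{FP})}.
\end{align*}
```

Because both expected utilities are linear in p, the posterior enters the decision only through its mean, which is why the argument does not depend on how that mean is computed.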

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard Bayesian modeling assumptions and decision-theoretic justification for the posterior mean. No new physical entities are postulated. Specific prior hyperparameters and approximation accuracy are not detailed in the abstract.

free parameters (1)

shrinkage prior hyperparameters
The paper suggests priors with user input or automatic shrinkage; these parameters are chosen or tuned and affect the posterior distributions.

axioms (2)

domain assumption · The posterior mean is the Bayes decision rule that maximizes expected utility for the clinical decision problem.
Invoked to justify using the posterior mean rather than the mode or plug-in estimate.
domain assumption · Laplace or normal approximation adequately represents the posterior for regression coefficients under the chosen shrinkage priors.
Allows avoidance of Monte Carlo sampling while preserving uncertainty quantification.

pith-pipeline@v0.9.0 · 5808 in / 1557 out tokens · 54893 ms · 2026-05-20T07:13:39.899307+00:00 · methodology
