Prior Smoothing for Multivariate Disease Mapping Models

Alan E. Gelfand; Garazi Retegui; Jaione Etxeberria; María Dolores Ugarte

arxiv: 2602.10955 · v2 · submitted 2026-02-11 · 📊 stat.ME · stat.AP

Pith reviewed 2026-05-16 02:34 UTC · model grok-4.3

keywords: multivariate disease mapping · spatial smoothing · Bayesian priors · hierarchical models · areal data · risk estimation · Poisson models

The pith

Multivariate priors smooth disease risk estimates in predictable ways across areas and diseases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how three different multivariate priors induce smoothing in models for multiple disease incidences across geographic areas. It distinguishes smoothing that occurs within a single prior as hyperparameters vary from smoothing that differs when switching between priors. By developing both theoretical expressions and empirical measures, the work shows what kind of departure from raw data a user should expect under each prior. This matters because model choice in disease mapping is often driven by how much smoothing is desired rather than by fit alone.

Core claim

For three choices of multivariate prior on the spatial random effects in a hierarchical Poisson model, both within-prior smoothing as a function of hyperparameters and across-prior smoothing can be quantified using newly proposed theoretical and empirical metrics, as demonstrated on simulated and real data.

What carries the argument

The multivariate priors on the vector of spatial random effects, which control dependence and smoothing both within and across diseases through their covariance structure and hyperparameters.

Load-bearing premise

The theoretical and empirical metrics capture the essential smoothing behavior without depending on particular data characteristics or specific hyperparameter estimation procedures.

What would settle it

Apply the metrics to a new simulated dataset with known true risks and check whether the predicted smoothing matches the actual posterior estimates under each prior.
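
One way such a check could be set up is sketched below. It is a minimal illustration, not the paper's procedure: the true risks, the expected counts, and the conjugate Gamma prior are all assumptions introduced here, and the Gamma-Poisson posterior mean stands in for the posterior estimates that a multivariate spatial prior would give. The function names and the departure measure are hypothetical.

```python
import numpy as np


def simulate_counts(true_risk, expected, rng):
    """Draw Poisson counts O_i ~ Poisson(E_i * r_i) for known true risks r_i."""
    return rng.poisson(expected * true_risk)


def gamma_poisson_posterior_mean(observed, expected, shape, rate):
    """Posterior mean of r_i under a Gamma(shape, rate) prior (stand-in smoother)."""
    return (shape + observed) / (rate + expected)


def mean_abs_departure(smoothed, raw):
    """Hypothetical empirical measure: mean |smoothed - raw| across areas."""
    return float(np.mean(np.abs(smoothed - raw)))


rng = np.random.default_rng(0)
n_areas = 50
true_risk = rng.lognormal(mean=0.0, sigma=0.3, size=n_areas)  # assumed truth
expected = rng.uniform(5.0, 50.0, size=n_areas)               # assumed expected counts
observed = simulate_counts(true_risk, expected, rng)
raw_risk = observed / expected                                 # observed/empirical risks

# Both priors have mean 1; the second is tighter and so pulls harder toward 1.
weak = gamma_poisson_posterior_mean(observed, expected, shape=1.0, rate=1.0)
strong = gamma_poisson_posterior_mean(observed, expected, shape=20.0, rate=20.0)

departure_weak = mean_abs_departure(weak, raw_risk)
departure_strong = mean_abs_departure(strong, raw_risk)
```

With a prior mean of 1, the departure from the raw risk in each area equals |1 - O/E| scaled by shape / (shape + E), so the tighter prior always departs at least as far from the raw risks. A real check would replace the stand-in smoother with fitted multivariate models and compare against the paper's own metrics.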

Original abstract

To date, we have seen the emergence of a large literature on multivariate disease mapping. That is, incidence of (or mortality from) multiple diseases is recorded at the scale of areal units where incidence (mortality) across the diseases is expected to manifest dependence. The modeling involves a hierarchical structure: a Poisson model for disease counts (conditioning on the rates) at the first stage, and a specification of a function of the rates using spatial random effects at the second stage. These random effects are specified as a prior and introduce spatial smoothing to the rate (or risk) estimates. What we see in the literature is the amount of smoothing induced under a given prior across areal units compared with the observed/empirical risks. Our contribution here extends previous research on smoothing in univariate areal data models. Specifically, for three different choices of multivariate prior, we investigate both within prior smoothing according to hyperparameters and across prior smoothing. Its benefit to the user is to illuminate the expected nature of departure from perfect fit associated with these priors since model performance is not a question of goodness of fit. We propose both theoretical and empirical metrics for our investigation and illustrate with both simulated and real data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Extends univariate smoothing metrics to three multivariate priors in disease mapping with sims and real data, but the metrics' behavior under hyperparameter estimation is unexamined.

This paper takes the univariate prior smoothing metrics and carries them over to the multivariate setting for areal disease mapping. For three common multivariate priors, the authors define theoretical metrics that depend on the hyperparameters, plus empirical versions, then show how these quantify within-prior smoothing and across-prior differences in expected departure from perfect fit. The simulations and real-data illustration make the numbers concrete and give users a sense of how prior choice affects the amount of smoothing they will see. That part is done cleanly and directly addresses the practical question of what kind of bias or shrinkage to expect.

The soft spot is exactly the one the stress-test note flags: the metrics are written as functions of fixed hyperparameter values, yet in practice those values are estimated. The paper does not check whether the metrics stay stable when hyperparameters come from marginal likelihood maximization versus full posterior sampling, or how much they shift with different data features. Without that check the across-prior comparisons lose some of their claimed generality.

The work is aimed at spatial epidemiologists who already fit multivariate models and want a more principled way to compare priors. It is not broad enough to interest most statisticians outside that niche. I would send it to peer review so referees can verify the metric derivations and run the missing sensitivity checks; the central claim is defensible once those gaps are filled.

Referee Report

2 major / 2 minor

Summary. The paper extends univariate smoothing research to multivariate disease mapping models. For three multivariate priors, it defines theoretical and empirical metrics to quantify within-prior smoothing (as a function of hyperparameters) and across-prior smoothing, then illustrates these metrics on simulated and real data to characterize expected departures from perfect fit.

Significance. If the metrics are shown to be robust, the work supplies practitioners with concrete tools for anticipating smoothing behavior under different multivariate priors, moving beyond goodness-of-fit diagnostics. The combination of theoretical derivations, simulation experiments, and real-data illustration is a clear strength.

major comments (2)

[Section 3 (metric definitions) and Section 4 (simulation design)] The metrics are derived under the assumption that hyperparameter values are fixed inputs, yet no analysis examines invariance to the estimation procedure (marginal likelihood maximization versus full posterior sampling). This is load-bearing for the across-prior comparisons, because material shifts under alternative routes would alter the intended interpretation as prior-specific smoothing properties.
[Section 4 (simulation results) and Section 5 (real-data application)] The simulation study demonstrates the metrics but does not include a systematic sensitivity check to data features (e.g., varying spatial correlation strength or disease count imbalance) or to hyperparameter estimation methods. Without this, the claim that the metrics fully capture smoothing behavior independent of data or estimation details remains unverified.

minor comments (2)

[Abstract] The abstract lists the three priors but does not name them explicitly; adding the precise names (e.g., multivariate CAR, multivariate BYM, etc.) would improve immediate clarity.
[Section 2 (model specification)] Notation for the rate vector and the multivariate random-effect covariance matrices should be introduced once and used consistently; occasional redefinition of symbols slows reading.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which highlight important aspects of how the proposed metrics should be interpreted and validated. We address each major comment below and describe the revisions we will undertake.

Referee: [Section 3 (metric definitions) and Section 4 (simulation design)] The metrics are derived under the assumption that hyperparameter values are fixed inputs, yet no analysis examines invariance to the estimation procedure (marginal likelihood maximization versus full posterior sampling). This is load-bearing for the across-prior comparisons, because material shifts under alternative routes would alter the intended interpretation as prior-specific smoothing properties.

Authors: The metrics are explicitly defined as functions of fixed hyperparameter values to isolate the intrinsic smoothing properties of each prior, independent of data or estimation details. This design enables clean theoretical and across-prior comparisons. We agree that practical use involves estimated hyperparameters and that invariance should be checked. In revision we will add a targeted simulation (new subsection in Section 4) that estimates hyperparameters via both marginal likelihood maximization and MCMC, then recomputes the metrics; results will show that qualitative ordering and relative magnitudes across priors are preserved, with only modest quantitative shifts. revision: yes
Referee: [Section 4 (simulation results) and Section 5 (real-data application)] The simulation study demonstrates the metrics but does not include a systematic sensitivity check to data features (e.g., varying spatial correlation strength or disease count imbalance) or to hyperparameter estimation methods. Without this, the claim that the metrics fully capture smoothing behavior independent of data or estimation details remains unverified.

Authors: The existing simulations vary the number of diseases and basic spatial structure, but we acknowledge they do not systematically explore spatial correlation strength or count imbalance. In the revision we will expand the simulation grid to include a range of spatial correlation parameters and controlled imbalance in disease counts, recomputing all metrics under each setting. Combined with the estimation-method check described above, these additions will demonstrate robustness and support the claim that the metrics primarily reflect prior-specific behavior. revision: yes
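
The expanded grid described in this response could be enumerated along the following lines. This is only a sketch under assumptions made here: the parameter values, the two-disease setup, and the names `build_grid` and `expected_counts` are hypothetical, and the step that simulates counts and recomputes the metrics is left as a placeholder.

```python
from itertools import product

import numpy as np


def build_grid(lambdas, imbalance_ratios):
    """All (spatial correlation, count imbalance) combinations as dicts."""
    return [
        {"lam": lam, "ratio": ratio}
        for lam, ratio in product(lambdas, imbalance_ratios)
    ]


def expected_counts(n_areas, base, ratio):
    """Expected counts for two diseases; disease 2 is `ratio` times rarer."""
    common = np.full(n_areas, base)
    rare = np.full(n_areas, base / ratio)
    return np.column_stack([common, rare])


# Assumed illustrative values, not taken from the paper.
grid = build_grid(lambdas=(0.1, 0.5, 0.9), imbalance_ratios=(1, 5, 20))

for setting in grid:
    e = expected_counts(n_areas=30, base=40.0, ratio=setting["ratio"])
    # Placeholder: simulate counts under this setting, fit each prior,
    # and recompute the smoothing metrics here.
    setting["expected_total"] = e.sum(axis=0)
```

Crossing the two factors keeps the effect of spatial correlation separable from the effect of count imbalance when the metrics are later compared across settings.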

Circularity Check

0 steps flagged

No significant circularity in proposed smoothing metrics

The paper defines theoretical and empirical metrics directly from the hierarchical Poisson model structure and the three chosen multivariate priors (with hyperparameters as explicit inputs) to quantify within-prior and across-prior smoothing. These metrics are constructed as functions of the prior specifications rather than derived via any reduction to fitted parameters or external predictions that collapse by construction. The work extends univariate smoothing concepts but presents the multivariate metrics as new proposals illustrated on simulated and real data, without load-bearing self-citations, uniqueness theorems, or ansatzes that force the results. The central claims remain independent of the inputs by the paper's own framing.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The work rests on the standard hierarchical Poisson-lognormal model for areal counts with multivariate spatial random effects; hyperparameters control smoothing but are treated as given inputs rather than derived.

free parameters (1)

hyperparameters of the three multivariate priors
Smoothing amount is explicitly a function of these values; they are either fixed or estimated from data.

axioms (1)

domain assumption: Hierarchical Poisson model for counts conditional on rates, with second-stage multivariate spatial random effects
Standard setup in disease mapping literature; invoked throughout the abstract.
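
Written out, the two-stage setup in this ledger entry might look as follows. The notation is an assumption made here for illustration, and the paper's own symbols may differ: O_ij and E_ij are the observed and expected counts for area i and disease j, r_ij is the relative risk, α_j is a disease-specific intercept, and θ is the stacked vector of spatial random effects whose covariance Σ depends on hyperparameters ψ.

```latex
\begin{align*}
O_{ij} \mid r_{ij} &\sim \text{Poisson}(E_{ij}\, r_{ij}),
  \qquad i = 1,\dots,n,\; j = 1,\dots,J, \\
\log r_{ij} &= \alpha_j + \theta_{ij}, \\
\boldsymbol{\theta} = (\theta_{11},\dots,\theta_{nJ})^{\top}
  &\sim \text{N}\bigl(\mathbf{0},\, \boldsymbol{\Sigma}(\boldsymbol{\psi})\bigr).
\end{align*}
```

For an intrinsic prior such as the iCAR, the precision matrix is singular, so only Σ^{-1} is defined and the normal distribution in the last line is improper.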

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

we propose both theoretical and empirical metrics... multivariate TCV is ∑_i |[(Σ^{-1})_{ii}]^{-1}|
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

M-models with three different spatial priors, the iCAR, the LCAR and the LjCAR
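
The TCV expression quoted in the first entry above can be evaluated numerically once a precision matrix Σ^{-1} is available. The sketch below is an illustration under stated assumptions, not the paper's implementation: it reads (Σ^{-1})_{ii} as the i-th diagonal block of the precision matrix (a scalar when there is one disease) and |·| as a determinant, it uses a Leroux-type precision λ(D − W) + (1 − λ)I (assumed here to correspond to what the passage calls LCAR), and the four-area path graph, the between-disease precision, and the function names are hypothetical.

```python
import numpy as np


def leroux_precision(adjacency, lam, tau=1.0):
    """Leroux-type precision tau * (lam * (D - W) + (1 - lam) * I)."""
    w = np.asarray(adjacency, dtype=float)
    d = np.diag(w.sum(axis=1))  # D holds the number of neighbours per area
    return tau * (lam * (d - w) + (1.0 - lam) * np.eye(w.shape[0]))


def multivariate_tcv(precision, block_size=1):
    """Sum over i of |det(inverse of the i-th diagonal block of the precision)|."""
    q = np.asarray(precision, dtype=float)
    n_blocks = q.shape[0] // block_size
    total = 0.0
    for i in range(n_blocks):
        s = slice(i * block_size, (i + 1) * block_size)
        total += abs(np.linalg.det(np.linalg.inv(q[s, s])))
    return total


# Hypothetical path graph 1-2-3-4 (assumed for illustration).
adjacency = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
])

# Within-prior behaviour: TCV as the spatial parameter lam varies.
tcv_by_lambda = {
    lam: multivariate_tcv(leroux_precision(adjacency, lam))
    for lam in (0.0, 0.5, 0.9)
}

# Hypothetical separable two-disease precision, ordered area by area,
# so that the i-th 2x2 diagonal block is q_ii * omega.
omega = np.array([[2.0, -0.5], [-0.5, 1.0]])  # assumed between-disease precision
joint_precision = np.kron(leroux_precision(adjacency, 0.5), omega)
tcv_joint = multivariate_tcv(joint_precision, block_size=2)
```

On this graph the value falls as lam increases, because the diagonal of the precision grows with the number of neighbours. The separable Kronecker structure is a simplification chosen to keep the example short; the M-models named in the second passage would give a different joint precision.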

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.