A Flat Connection: The Pooling Factor and the Geometry of Centring in Hierarchical MCMC
Pith reviewed 2026-06-26 18:03 UTC · model grok-4.3
The pith
The Fisher-induced Ehresmann connection on hierarchical posteriors is flat, so the mixing obstruction reduces to the pooling factor.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Ehresmann connection A = -G_FF^{-1}G_BF induced by the Fisher information metric is flat for any smooth hierarchical posterior because its horizontal leaves coincide with the level sets of the fiber score ∂_α log p. There is therefore no geometric obstruction above the metric. The only remaining obstruction is the conditional dependence of the fiber parameters on the base parameters, governed per group by the prior fraction π_j (the pooling factor). From this quantity the paper recovers that prior-dominated groups mix slowly, that the optimal per-group non-centring weight follows in closed form, and that the funnel is a separate base-space pathology distinguished by its opposite dependence on the hierarchical variance.
What carries the argument
The Ehresmann connection induced by the Fisher information metric on the fiber bundle of hierarchical parameters, proved flat with horizontal leaves given by the level sets of the fiber score.
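To make the connection concrete, here is a minimal sketch on a toy normal-normal hierarchy. The model, the fixed hierarchical scale `tau`, the function names, and the specific form of the prior fraction (prior precision divided by conditional precision) are all assumptions made for illustration; none of them are taken from the paper or from the fibr package. In this toy case the connection coefficient for each group comes out equal to that prior fraction.

```python
# Toy normal-normal hierarchy (an illustrative assumption, not the paper's code):
#   y_j ~ N(theta_j, sigma_j^2),  theta_j ~ N(mu, tau^2),  tau held fixed.
# Base coordinate: mu.  Fiber coordinates: theta_j.

def observed_information_blocks(sigma, tau):
    """Blocks of G = -Hessian(log p); constant in (mu, theta) for this model."""
    g_ff = [1.0 / s**2 + 1.0 / tau**2 for s in sigma]  # diagonal of G_FF
    g_bf = [-1.0 / tau**2 for _ in sigma]              # mixed block: -d_mu d_theta_j log p
    return g_ff, g_bf

def connection(sigma, tau):
    """A_j = -G_FF^{-1} G_BF, computed per group because G_FF is diagonal here."""
    g_ff, g_bf = observed_information_blocks(sigma, tau)
    return [-b / f for f, b in zip(g_ff, g_bf)]

def prior_fraction(sigma, tau):
    """Prior share of the conditional precision of theta_j (assumed form of pi_j)."""
    return [(1.0 / tau**2) / (1.0 / s**2 + 1.0 / tau**2) for s in sigma]

sigma = [0.5, 1.0, 4.0]  # hypothetical per-group observation scales
tau = 1.0                # hypothetical hierarchical scale
A = connection(sigma, tau)
pi = prior_fraction(sigma, tau)
for a, p in zip(A, pi):
    print(f"A_j = {a:.4f}   pi_j = {p:.4f}")
```

Groups with noisy data (large `sigma_j`) have a coefficient near 1, meaning the horizontal direction moves `theta_j` almost in lockstep with `mu`.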
If this is right
- The optimal per-group non-centring weight is recoverable in closed form from the pooling factor π_j.
- Prior-dominated groups show excess conditional autocorrelation whose magnitude is predicted by π_j.
- The funnel pathology is separable from the pooling effect by their opposite dependence on the hierarchical variance.
- A direct attribution test confirms NUTS does not transport the fiber, with the chain-level footprint being conditional autocorrelation in prior-dominated groups.
- Genuine curvature appears only when the connection is built from a sampler's fixed working metric, making holonomy an algorithmic rather than geometric phenomenon.
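The summary does not reproduce the paper's closed-form weight, so the following is only a sketch of how such a weight can arise in the toy normal-normal model with fixed `tau`. It assumes a partial non-centring of the form `theta_j = theta_tilde_j + w * mu` (with `w = 0` centred and `w = 1` non-centred) and picks the `w` that zeroes the mixed block of the observed information between `mu` and `theta_tilde_j`. The function names and the parameterisation convention are hypothetical; other conventions would give `1 - w`.

```python
def cross_information(w, sigma_j, tau):
    """Mixed block of -Hessian(log p) between mu and theta_tilde_j,
    where theta_j = theta_tilde_j + w * mu in the toy normal-normal model."""
    return w / sigma_j**2 + (w - 1.0) / tau**2

def decoupling_weight(sigma_j, tau):
    """Weight that zeroes the mixed block; equals the assumed prior fraction."""
    return (1.0 / tau**2) / (1.0 / sigma_j**2 + 1.0 / tau**2)

for sigma_j in (0.5, 1.0, 4.0):  # hypothetical observation scales
    w = decoupling_weight(sigma_j, tau=1.0)
    print(f"sigma_j = {sigma_j}: w = {w:.4f}, "
          f"residual coupling = {cross_information(w, sigma_j, 1.0):.2e}")
```

Under these assumptions, prior-dominated groups get a weight near 1 (close to fully non-centred) and data-dominated groups get a weight near 0 (close to centred).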
Where Pith is reading between the lines
- Group-level mixing diagnostics could be constructed by estimating the pooling factor directly from posterior draws.
- The flatness result suggests that other apparent geometric obstructions in sampling algorithms may reduce to statistical dependence once the correct connection is identified.
- Models with rotational curvature under fixed-mass-matrix connections offer a testable distinction between algorithmic and intrinsic geometric effects.
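The first suggestion above can be sketched on the same toy normal-normal model. With `tau` fixed and a flat prior on `mu` (both assumptions), the conditional mean of `theta_j` given `mu` is linear in `mu` with slope equal to the assumed prior fraction, so a least-squares slope across draws estimates it. The draws here are simulated exactly rather than produced by a sampler, and none of the function names correspond to the fibr package.

```python
import random

def simulate_draws(y, sigma, tau, n, seed=0):
    """Exact joint posterior draws of (mu, theta) under a flat prior on mu."""
    rng = random.Random(seed)
    # Marginally y_j | mu ~ N(mu, sigma_j^2 + tau^2).
    marg_prec = [1.0 / (s**2 + tau**2) for s in sigma]
    mu_var = 1.0 / sum(marg_prec)
    mu_mean = mu_var * sum(p * yj for p, yj in zip(marg_prec, y))
    pi = [(1.0 / tau**2) / (1.0 / s**2 + 1.0 / tau**2) for s in sigma]
    draws = []
    for _ in range(n):
        mu = rng.gauss(mu_mean, mu_var**0.5)
        # theta_j | mu, y ~ N((1 - pi_j) y_j + pi_j mu, pi_j tau^2)
        theta = [rng.gauss((1 - p) * yj + p * mu, (p * tau**2) ** 0.5)
                 for p, yj in zip(pi, y)]
        draws.append((mu, theta))
    return draws, pi

def slope_estimates(draws):
    """Least-squares slope of each theta_j on mu across draws."""
    n = len(draws)
    mu_bar = sum(mu for mu, _ in draws) / n
    s_mm = sum((mu - mu_bar) ** 2 for mu, _ in draws)
    slopes = []
    for j in range(len(draws[0][1])):
        th_bar = sum(th[j] for _, th in draws) / n
        s_mt = sum((mu - mu_bar) * (th[j] - th_bar) for mu, th in draws)
        slopes.append(s_mt / s_mm)
    return slopes

# Hypothetical data: three groups with increasing observation noise.
draws, pi_true = simulate_draws(y=[1.0, -0.5, 2.0], sigma=[0.5, 1.0, 4.0],
                                tau=1.0, n=5000, seed=0)
for est, p in zip(slope_estimates(draws), pi_true):
    print(f"estimated slope = {est:.3f}   assumed pi_j = {p:.3f}")
```

When the hierarchical scale is itself sampled, the slope averages over its posterior, so this simple regression should be read as a rough diagnostic rather than an estimator of a single fixed quantity.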
Load-bearing premise
The joint parameter space of a hierarchical model forms a fiber bundle with hyperparameters as the base manifold and group-level parameters as the fibers.
What would settle it
A direct computation of the curvature two-form of the Fisher-induced connection A = -G_FF^{-1}G_BF on a smooth non-Gaussian hierarchical posterior that yields a non-zero result would falsify the flatness claim.
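A numerical version of this check can be sketched for one small non-Gaussian case. The model below (a single Poisson group with a normal prior on the log rate), the evaluation point, and all function names are assumptions chosen for illustration. The base is taken to be two-dimensional, `(mu, omega)` with `omega = log tau`, because a connection over a one-dimensional base is flat trivially. The connection is built from the observed information, as the authors' rebuttal specifies, and the curvature is evaluated by central finite differences. An artificial non-flat connection is included only to show that the check can return a non-zero value.

```python
import math

Y = 3  # hypothetical observed count for the single group

def fiber_score(mu, omega, theta):
    """s = d/dtheta log p for y ~ Poisson(exp(theta)), theta ~ N(mu, exp(2*omega))."""
    return Y - math.exp(theta) - (theta - mu) * math.exp(-2.0 * omega)

def partial(f, point, axis, h=1e-4):
    """Central finite difference of f along one coordinate of `point`."""
    up, dn = list(point), list(point)
    up[axis] += h
    dn[axis] -= h
    return (f(*up) - f(*dn)) / (2.0 * h)

def observed_info_connection(mu, omega, theta):
    """(A_mu, A_omega) = -(d_theta s)^{-1} (d_mu s, d_omega s), in closed form."""
    prior_prec = math.exp(-2.0 * omega)
    s_theta = -math.exp(theta) - prior_prec
    s_mu = prior_prec
    s_omega = 2.0 * (theta - mu) * prior_prec
    return (-s_mu / s_theta, -s_omega / s_theta)

def artificial_connection(mu, omega, theta):
    """A deliberately non-flat connection, used only to show the check can fail."""
    return (0.0, mu)

def curvature(conn, mu, omega, theta):
    """F = H_mu(A_omega) - H_omega(A_mu), with H_a = d_a + A_a * d_theta."""
    point = (mu, omega, theta)
    a_mu, a_omega = conn(*point)
    comp = lambda k: (lambda *p: conn(*p)[k])
    h_mu_a_omega = partial(comp(1), point, 0) + a_mu * partial(comp(1), point, 2)
    h_omega_a_mu = partial(comp(0), point, 1) + a_omega * partial(comp(0), point, 2)
    return h_mu_a_omega - h_omega_a_mu

point = (0.3, -0.2, 0.7)  # arbitrary evaluation point (mu, omega, theta)
print("observed-information connection:", curvature(observed_info_connection, *point))
print("artificial connection:          ", curvature(artificial_connection, *point))
```

A value indistinguishable from zero at sampled points is consistent with flatness for this one model but does not prove it. A clearly non-zero value for a connection built the same way on another smooth posterior would be the falsifying result described above.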
Original abstract
Standard MCMC diagnostics ($\hat{R}$, effective sample size, divergence counts) detect whether a chain has mixed, but not why it has not. We ask whether the centring/non-centring obstruction in hierarchical models has a geometric cause beyond the metric. The joint parameter space is a fiber bundle (hyperparameters the base, group-level parameters the fibers), and the Fisher information metric induces an Ehresmann connection $A = -G_{FF}^{-1}G_{BF}$; the natural hypothesis is that the obstruction is its curvature, felt by the sampler as holonomy. We prove this false. The connection is flat for any smooth hierarchical posterior, not only the Gaussian case, because its horizontal leaves are the level sets of the fiber score $\partial_\alpha \log p$: there is no geometric obstruction above the metric. What remains is statistical, not geometric, and the flat connection identifies it as a single quantity: the conditional dependence of fiber on base, governed per group by the prior fraction $\pi_j$, the classical pooling factor. From it the framework recovers the established picture, that prior-dominated groups mix slowly and that the optimal per-group non-centring weight follows in closed form, and a simulation study separates this base-fiber coupling from the funnel, a distinct base-space pathology, by their opposite dependence on the hierarchical variance. A direct attribution test confirms that NUTS does not transport the fiber: the chain-level footprint is excess conditional autocorrelation in prior-dominated groups, exactly as $\pi_j$ predicts. Genuine, even rotational, curvature does appear, but only for connections built from a sampler's working metric (a fixed mass matrix), where holonomy re-enters as an algorithmic rather than geometric phenomenon. The prior-fraction diagnostic is distributed as the R package fibr, with the geometric methods as accompanying reproduction code.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript models the joint parameter space of a hierarchical model as a fiber bundle (hyperparameters as base, group-level parameters as fibers) and equips it with the Fisher information metric to induce an Ehresmann connection A = −G_FF^{-1}G_BF. It claims to prove that this connection is flat for any smooth hierarchical posterior (not merely Gaussian), because the horizontal leaves coincide exactly with the level sets of the fiber score ∂_α log p. Consequently there is no geometric obstruction above the metric; the centring/non-centring difficulty reduces to the classical per-group pooling factor π_j that governs conditional dependence of fiber on base. The paper recovers known mixing behaviour, separates this effect from the funnel pathology via simulation, and supplies an R package fibr together with reproduction code.
Significance. If the flatness result holds under the metric actually employed, the work supplies a clean geometric re-derivation of the pooling factor as the sole source of the centring obstruction and cleanly distinguishes it from the distinct base-space funnel pathology. The explicit attribution test with NUTS and the closed-form optimal non-centring weight are useful. The release of the fibr package and accompanying code strengthens reproducibility.
major comments (1)
- [Abstract] Abstract (and opening paragraph defining the connection): the central identification that horizontal vectors satisfy ds(X)=0 precisely when X_F = −(∂_F ∂_F log p)^{-1}(∂_B ∂_F log p) X_B holds if and only if the blocks of G are taken from the observed information −∇² log p. The conventional Fisher information metric uses the expectation E[−∇² log p] (or the score variance), which is a different tensor; under that choice the pointwise equality fails and flatness need not follow. The manuscript never states which definition of G is used, yet asserts flatness “for any smooth hierarchical posterior” without qualification. This is load-bearing for the claim that there is “no geometric obstruction above the metric.”
minor comments (1)
- The simulation study that separates base-fiber coupling from the funnel by their opposite dependence on hierarchical variance is mentioned only in the abstract; a brief description of the design (number of groups, range of π_j, metrics used) would help readers assess the separation.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review. The observation regarding the precise definition of the Fisher information metric is well taken; we address it directly below and will revise the manuscript to make the choice explicit.
- Referee: [Abstract] Abstract (and opening paragraph defining the connection): the central identification that horizontal vectors satisfy ds(X)=0 precisely when X_F = −(∂_F ∂_F log p)^{-1}(∂_B ∂_F log p) X_B holds if and only if the blocks of G are taken from the observed information −∇² log p. The conventional Fisher information metric uses the expectation E[−∇² log p] (or the score variance), which is a different tensor; under that choice the pointwise equality fails and flatness need not follow. The manuscript never states which definition of G is used, yet asserts flatness “for any smooth hierarchical posterior” without qualification. This is load-bearing for the claim that there is “no geometric obstruction above the metric.”
Authors: We agree that the definition of G must be stated explicitly. The paper employs the observed information matrix G = −∇² log p (negative Hessian of the log-posterior evaluated pointwise), not its expectation. This is the tensor for which the horizontal distribution is exactly the kernel of the fiber-score map ds, so that the horizontal leaves coincide with the level sets of ∂_α log p and the connection is flat for any smooth posterior. The conventional expected Fisher metric would not yield this pointwise identification. We will revise the abstract and the introductory paragraphs that define the connection to specify that the metric is the observed information tensor.
Revision: yes.
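The identification at issue in this exchange can be written out in one line. This is a sketch, assuming s denotes the fiber score map, that G_FF is invertible, and that the mixed block G_BF is read as acting on base vectors:

```latex
s = \partial_F \log p, \qquad
ds(X) = (\partial_B \partial_F \log p)\, X_B + (\partial_F \partial_F \log p)\, X_F = 0
\;\Longleftrightarrow\;
X_F = -(\partial_F \partial_F \log p)^{-1} (\partial_B \partial_F \log p)\, X_B .

% With the observed information G = -\nabla^2 \log p:
G_{FF} = -\partial_F \partial_F \log p, \quad
G_{BF} = -\partial_B \partial_F \log p
\;\Longrightarrow\;
X_F = -G_{FF}^{-1} G_{BF}\, X_B = A\, X_B .
```

The two minus signs from the blocks cancel, which is why the formula for A keeps a single leading minus. If G were instead the expected information, its blocks would no longer be the pointwise second derivatives of log p and the final step would not follow.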
Circularity Check
No significant circularity; derivation is self-contained mathematical identification
The paper defines the Ehresmann connection A = -G_FF^{-1}G_BF from the Fisher metric on the fiber bundle and proves flatness by showing that the horizontal condition matches the level sets of the fiber score ∂_α log p via direct differentiation. This is a definitional equivalence derived from the given objects rather than a reduction to fitted inputs or presupposed results. The subsequent identification of the pooling factor π_j follows as a statistical interpretation of the resulting flat geometry and recovers known behavior without circular renaming or self-citation load-bearing. No steps match the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (2)
- Domain assumption: The joint parameter space of a hierarchical model can be modeled as a fiber bundle with hyperparameters as base and group-level parameters as fibers.
- Domain assumption: The Fisher information metric on this bundle induces an Ehresmann connection given by A = -G_FF^{-1}G_BF.