Intrinsic effective sample size for manifold-valued Markov chain Monte Carlo via kernel discrepancy
Pith reviewed 2026-05-07 13:44 UTC · model grok-4.3
The pith
Kernel discrepancy defines a coordinate-free effective sample size for manifold-valued Markov chain Monte Carlo.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
An intrinsic effective sample size is defined as the number of independent samples that would match the expected squared kernel discrepancy of the MCMC empirical measure to the target measure. The definition yields an exact finite-sample risk interpretation, an asymptotic integrated-autocorrelation representation, invariance under transported kernels, and operator and principal-direction interpretations. A lag-window estimator is shown to be consistent under boundedness and absolute-regularity conditions, and valid kernel constructions on manifolds are discussed, noting that geodesic Gaussian kernels are not generally positive definite on curved spaces.
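A compact way to write this definition down, using notation that is assumed here rather than quoted from the paper: $k$ is a bounded positive definite kernel with RKHS $\mathcal H$, $\mu_\pi$ is the kernel mean of the target $\pi$, and $\hat\pi_n$ is the empirical measure of a stationary chain $X_1,\dots,X_n$ with $X_0 \sim \pi$. The identities below are standard kernel-mean-embedding algebra; the paper's own normalization may differ.

```latex
\begin{align*}
\mathrm{MMD}_k^2(\hat\pi_n,\pi)
  &= \Bigl\| \frac{1}{n}\sum_{t=1}^{n} k(\cdot,X_t) - \mu_\pi \Bigr\|_{\mathcal H}^2,
  \qquad \mu_\pi = \int k(\cdot,x)\,\pi(dx), \\
C_k(t) &= \mathbb{E}\,k(X_0,X_t) - \|\mu_\pi\|_{\mathcal H}^2,
  \qquad \sigma_k^2 = C_k(0), \\
\text{i.i.d. draws of size } m:\quad
\mathbb{E}\,\mathrm{MMD}_k^2(\hat\pi_m,\pi) &= \frac{\sigma_k^2}{m}, \\
\text{stationary chain of length } n:\quad
\mathbb{E}\,\mathrm{MMD}_k^2(\hat\pi_n,\pi)
  &= \frac{1}{n}\Bigl[ C_k(0) + 2\sum_{t=1}^{n-1}\Bigl(1-\frac{t}{n}\Bigr) C_k(t) \Bigr], \\
\mathrm{ESS}_k(n)
  &= \frac{\sigma_k^2}{\mathbb{E}\,\mathrm{MMD}_k^2(\hat\pi_n,\pi)}, \\
\frac{\mathrm{ESS}_k(n)}{n}
  &\longrightarrow \frac{1}{1 + 2\sum_{t\ge 1}\rho_k(t)},
  \qquad \rho_k(t) = \frac{C_k(t)}{C_k(0)}.
\end{align*}
```

Matching the i.i.d. risk to the chain risk gives the finite-sample definition in the fifth line. The limit in the last line requires the kernel autocovariances $C_k(t)$ to be absolutely summable.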
What carries the argument
Squared kernel discrepancy between the empirical distribution of the chain and the target distribution, used to equate MCMC performance to an equivalent number of independent draws.
If this is right
- The diagnostic remains unchanged under rotations, chart changes, or alternative embeddings of the same underlying path, provided the kernel respects the geometry of the state space.
- A lag-window estimator of the quantity is consistent under boundedness and absolute-regularity (beta-mixing) conditions on the chain.
- The measure admits an interpretation in terms of the reproducing kernel Hilbert space operator and its principal directions.
- Only kernels that respect the manifold geometry are admissible; geodesic Gaussian kernels generally fail positive-definiteness on curved spaces.
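The positive-definiteness caveat in the last bullet can be checked numerically on a tiny example. The sketch below is an illustration, not taken from the paper: it builds Gram matrices on four equally spaced equator points of the unit sphere, once with a geodesic Gaussian kernel and once with the kernel exp(kappa * <x, y>), which is positive definite on any subset of Euclidean space. The point set, `sigma=2.0`, `kappa=1.0`, and the function names are all assumptions chosen for the demonstration.

```python
import numpy as np

# Four equally spaced points on the equator of the unit sphere S^2.
points = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [-1.0, 0.0, 0.0],
    [0.0, -1.0, 0.0],
])


def geodesic_gaussian_gram(x, sigma):
    """Gram matrix of exp(-d(x, y)^2 / (2 sigma^2)), d = great-circle distance."""
    cosines = np.clip(x @ x.T, -1.0, 1.0)  # guard arccos against round-off
    dist = np.arccos(cosines)
    return np.exp(-dist**2 / (2.0 * sigma**2))


def exp_inner_product_gram(x, kappa):
    """Gram matrix of exp(kappa * <x, y>), evaluated in the ambient space."""
    return np.exp(kappa * (x @ x.T))


# A positive definite kernel must give Gram matrices with no negative eigenvalues.
min_eig_geodesic = np.linalg.eigvalsh(geodesic_gaussian_gram(points, sigma=2.0)).min()
min_eig_inner = np.linalg.eigvalsh(exp_inner_product_gram(points, kappa=1.0)).min()

print("geodesic Gaussian, smallest eigenvalue:", min_eig_geodesic)
print("exp inner-product, smallest eigenvalue:", min_eig_inner)
```

For this configuration the geodesic Gaussian Gram matrix has a negative eigenvalue, so the corresponding "discrepancy" is not a squared RKHS distance, while the inner-product kernel passes the check. A single passing Gram matrix does not prove positive definiteness in general; the check can only refute it.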
Where Pith is reading between the lines
- The same discrepancy-based construction could serve as a diagnostic for other geometry-preserving samplers such as manifold Hamiltonian Monte Carlo.
- Researchers could calibrate the measure on additional manifolds like the Stiefel or Grassmannian to check whether the finite-sample risk interpretation holds in practice.
- The approach suggests a route to intrinsic diagnostics for other distributional functionals beyond effective sample size.
Load-bearing premise
Kernels exist that are positive definite and compatible with the manifold geometry, and the chain satisfies boundedness and absolute regularity so that the lag-window estimator converges.
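A minimal sketch of a lag-window estimate of the kernel effective sample size, assuming a Bartlett window, a fixed truncation lag, and the kernel exp(kappa * <x, y>) on the unit sphere. The names `kernel_ess`, `uniform_sphere`, and `gram_exp`, and all tuning values, are illustrative choices rather than the paper's estimator or settings.

```python
import numpy as np


def kernel_ess(gram, max_lag):
    """Bartlett lag-window estimate of n / (1 + 2 * sum_t rho_k(t)).

    gram is the n x n kernel Gram matrix of the chain in time order;
    max_lag is the truncation lag and must be smaller than n.
    """
    n = gram.shape[0]
    # Double-centre the Gram matrix so entries are RKHS inner products of
    # feature vectors taken about the sample kernel mean.
    col_mean = gram.mean(axis=0, keepdims=True)
    centred = gram - col_mean - col_mean.T + gram.mean()
    c0 = np.trace(centred) / n  # lag-0 kernel autocovariance
    tau = 1.0
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)  # Bartlett (triangular) window
        c_lag = np.trace(centred, offset=lag) / n  # biased lag-t autocovariance
        tau += 2.0 * weight * c_lag / c0
    return n / tau


def uniform_sphere(n, rng):
    """Draw n independent uniform points on the unit sphere S^2."""
    z = rng.standard_normal((n, 3))
    return z / np.linalg.norm(z, axis=1, keepdims=True)


def gram_exp(x, kappa=1.0):
    """Gram matrix of the kernel exp(kappa * <x, y>)."""
    return np.exp(kappa * (x @ x.T))


rng = np.random.default_rng(0)
iid = uniform_sphere(400, rng)
# A deliberately sticky path: each of 40 independent points is held for 10 steps.
sticky = np.repeat(uniform_sphere(40, rng), 10, axis=0)

ess_iid = kernel_ess(gram_exp(iid), max_lag=20)
ess_sticky = kernel_ess(gram_exp(sticky), max_lag=20)
print("independent draws:", ess_iid)
print("sticky path:", ess_sticky)
```

The sticky path should report a much smaller value than the independent draws of the same length. Storing the full Gram matrix costs O(n^2) memory, so long chains would need thinning or a streaming computation of the lag diagonals.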
What would settle it
On the sphere, rotate the MCMC samples and recompute both the proposed effective sample size and the empirical distributional error; if the effective sample size changes while the error stays fixed, the invariance claim fails.
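The effective-sample-size half of this test can be sketched as follows, under assumptions that are not from the paper: a synthetic path on the sphere whose height mixes slowly while its longitude is redrawn independently, a 90-degree rotation about the y-axis, a Bartlett lag-window estimator, and the kernel exp(<x, y>). A coordinate-wise effective sample size is obtained from the same estimator by using the linear kernel on a single coordinate. All names and tuning values are hypothetical.

```python
import numpy as np


def kernel_ess(gram, max_lag):
    """Bartlett lag-window effective sample size from a time-ordered Gram matrix."""
    n = gram.shape[0]
    col_mean = gram.mean(axis=0, keepdims=True)
    centred = gram - col_mean - col_mean.T + gram.mean()
    c0 = np.trace(centred) / n
    tau = 1.0
    for lag in range(1, max_lag + 1):
        weight = 1.0 - lag / (max_lag + 1.0)
        tau += 2.0 * weight * (np.trace(centred, offset=lag) / n) / c0
    return n / tau


rng = np.random.default_rng(1)
n = 500

# Slowly mixing height z_t (squashed AR(1)), independently redrawn longitude.
a = np.zeros(n)
for t in range(1, n):
    a[t] = 0.95 * a[t - 1] + np.sqrt(1.0 - 0.95**2) * rng.standard_normal()
z = np.tanh(a)
theta = rng.uniform(0.0, 2.0 * np.pi, size=n)
r = np.sqrt(1.0 - z**2)
samples = np.column_stack([r * np.cos(theta), r * np.sin(theta), z])

# Rotation by 90 degrees about the y-axis: the old z-coordinate becomes the new x.
rotation = np.array([[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
rotated = samples @ rotation.T


def intrinsic_ess(x):
    # The kernel depends on x and y only through <x, y>, which rotations preserve.
    return kernel_ess(np.exp(x @ x.T), max_lag=30)


def first_coordinate_ess(x):
    # Linear kernel on coordinate 0: a coordinate-wise effective sample size.
    return kernel_ess(np.outer(x[:, 0], x[:, 0]), max_lag=30)


ess_kernel_before, ess_kernel_after = intrinsic_ess(samples), intrinsic_ess(rotated)
ess_coord_before, ess_coord_after = first_coordinate_ess(samples), first_coordinate_ess(rotated)
print("kernel ESS:", ess_kernel_before, ess_kernel_after)
print("first-coordinate ESS:", ess_coord_before, ess_coord_after)
```

The kernel-based value should agree before and after rotation up to floating-point error, while the first-coordinate value should drop once the slowly mixing direction is rotated onto that coordinate. The sketch does not compute the empirical distributional error, which the full test also requires.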
Original abstract
Effective sample size is a standard summary of Markov chain Monte Carlo output, but it is usually attached to scalar or Euclidean summaries chosen by the analyst. For manifold-valued samples this choice is not canonical: coordinate-wise effective sample sizes can change under rotations, chart changes, or alternative embeddings of the same underlying path. We propose an intrinsic effective sample size based on kernel discrepancy. The proposed quantity is the number of independent draws that would yield the same expected squared kernel discrepancy between the empirical distribution and the target distribution. This gives an exact finite-sample risk interpretation, an asymptotic integrated-autocorrelation representation, and a coordinate-free diagnostic whenever the kernel respects the geometry of the state space. We establish invariance under transported kernels, operator and principal-direction interpretations, and consistency of a lag-window estimator under boundedness and absolute-regularity conditions. We also discuss valid kernel constructions on manifolds, emphasizing that geodesic Gaussian kernels are not generally positive definite on curved spaces. Sphere experiments illustrate rotation invariance and calibration of the proposed diagnostic against empirical distributional error.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an intrinsic effective sample size (ESS) for manifold-valued Markov chain Monte Carlo (MCMC) based on kernel discrepancy. The quantity is defined as the number of i.i.d. draws that would produce the same expected squared kernel discrepancy between the empirical distribution and the target as the MCMC chain does. It claims this yields an exact finite-sample risk interpretation, an asymptotic integrated-autocorrelation representation, invariance under transported kernels, operator and principal-direction interpretations, and consistency of a lag-window estimator under boundedness and absolute-regularity conditions. Valid kernel constructions on manifolds are discussed (noting geodesic Gaussian kernels are not generally positive definite on curved spaces), and sphere experiments illustrate rotation invariance and calibration against empirical distributional error.
Significance. If the mathematical claims hold, the work would provide a coordinate-free diagnostic for MCMC efficiency on manifolds, filling a gap where standard scalar or Euclidean ESS summaries are not invariant to coordinate choices, chart changes, or embeddings. The finite-sample risk interpretation and invariance properties would be useful in applications such as directional statistics and shape analysis. Credit is due for the explicit discussion of kernel constructions and the operator interpretations, which strengthen the proposal beyond ad-hoc definitions.
major comments (3)
- [Kernel constructions and invariance claims] The central definition of the intrinsic ESS rests on the kernel being positive definite and respecting manifold geometry, so that the discrepancy is a valid squared RKHS distance and the invariance claims attach. The abstract notes that geodesic Gaussian kernels fail to be positive definite on curved spaces; the manuscript must therefore supply explicit, verifiable constructions for the manifolds of interest (e.g., spheres, SO(3)) together with proofs that the resulting discrepancy yields the claimed finite-sample risk and asymptotic integrated-autocorrelation representations; otherwise the load-bearing interpretations do not hold.
- [Consistency of lag-window estimator] The abstract asserts consistency of the lag-window estimator under boundedness and absolute-regularity conditions, yet no derivation, error bounds, or explicit statement of the required mixing rates appears in the provided summary. Because this consistency underpins the practical use of the diagnostic, the full manuscript must include the proof (or a clear reference to standard results) with the precise conditions stated.
- [Experiments] Sphere experiments are invoked to illustrate rotation invariance and calibration against empirical distributional error, but the summary provides no quantitative results, baselines, or error analysis. Without these, it is impossible to evaluate whether the proposed ESS indeed calibrates correctly or outperforms coordinate-wise alternatives on the claimed manifolds.
minor comments (2)
- [Notation and definitions] Clarify the precise definition of the kernel discrepancy (including any normalization) and the exact formula for the intrinsic ESS early in the manuscript so that the finite-sample risk interpretation is immediately readable.
- [Assumptions] Ensure all assumptions (positive-definiteness, boundedness, absolute regularity) are collected in a single statement rather than scattered across the text.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive review of our manuscript. The comments identify areas where additional detail will strengthen the presentation, and we address each point below with our planned revisions.
Point-by-point responses
- Referee: [Kernel constructions and invariance claims] The central definition of the intrinsic ESS rests on the kernel being positive definite and respecting manifold geometry, so that the discrepancy is a valid squared RKHS distance and the invariance claims attach. The abstract notes that geodesic Gaussian kernels fail to be positive definite on curved spaces; the manuscript must therefore supply explicit, verifiable constructions for the manifolds of interest (e.g., spheres, SO(3)) together with proofs that the resulting discrepancy yields the claimed finite-sample risk and asymptotic integrated-autocorrelation representations; otherwise the load-bearing interpretations do not hold.
  Authors: We agree that explicit constructions are needed to make the invariance and risk interpretations fully verifiable. The manuscript already discusses valid kernel constructions on manifolds, explicitly notes the failure of geodesic Gaussians on curved spaces, and proposes geometry-respecting alternatives. To address the request, we will expand this section with concrete, verifiable examples for the sphere (e.g., kernels based on spherical harmonics or the von Mises-Fisher family) and for SO(3) (using bi-invariant kernels on the rotation group), together with short arguments or references confirming that the resulting discrepancy satisfies the finite-sample expected squared risk and the asymptotic integrated-autocorrelation representations.
  Revision: yes
- Referee: [Consistency of lag-window estimator] The abstract asserts consistency of the lag-window estimator under boundedness and absolute-regularity conditions, yet no derivation, error bounds, or explicit statement of the required mixing rates appears in the provided summary. Because this consistency underpins the practical use of the diagnostic, the full manuscript must include the proof (or a clear reference to standard results) with the precise conditions stated.
  Authors: The consistency result is stated in the manuscript under the given conditions, but we accept that a self-contained derivation or precise reference will improve accessibility. In the revision we will add the proof sketch, drawing on standard results for kernel mean embeddings under absolute regularity, and explicitly state the required mixing rates (summability of the beta-mixing, i.e. absolute-regularity, coefficients). If space is limited we will cite the relevant theorem from the dependent-data literature and verify that our boundedness assumption suffices.
  Revision: yes
- Referee: [Experiments] Sphere experiments are invoked to illustrate rotation invariance and calibration against empirical distributional error, but the summary provides no quantitative results, baselines, or error analysis. Without these, it is impossible to evaluate whether the proposed ESS indeed calibrates correctly or outperforms coordinate-wise alternatives on the claimed manifolds.
  Authors: The sphere experiments in the manuscript demonstrate rotation invariance and calibration, but we acknowledge that more quantitative detail is warranted. We will expand the experimental section to include numerical tables, explicit comparisons against coordinate-wise ESS, baseline methods, and error bars or calibration plots that quantify agreement with empirical distributional error. This will allow direct assessment of performance on the manifolds considered.
  Revision: yes
Circularity Check
Intrinsic ESS defined directly via kernel discrepancy without circular reduction to inputs
The paper defines the proposed intrinsic effective sample size explicitly as the number of i.i.d. draws yielding the same expected squared kernel discrepancy as the MCMC empirical measure. This is a direct definitional construction rather than a derivation that reduces by construction to fitted parameters, self-referential equations, or load-bearing self-citations. The abstract establishes invariance, operator interpretations, and lag-window consistency under external assumptions (boundedness, absolute regularity, and existence of suitable positive definite kernels), but these are independent properties attached to the definition, not reductions of it. No quoted steps in the provided material exhibit self-definition, fitted-input prediction, or ansatz smuggling; the kernel choice and the requirement that the kernel respect the manifold geometry are treated as external preconditions, not internal circularities. The chain of reasoning is therefore self-contained against the stated benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: existence of positive definite kernels that respect the manifold geometry
- domain assumption: boundedness and absolute-regularity conditions on the Markov chain