I-BBS: Coordinate-Free Inference of Latent Sub-Manifolds Using Random Distance Matrix Theory

Igor Halperin

arXiv: 2606.29675 · v1 · pith:5PIKWGU6new · submitted 2026-06-29 · cs.LG · cond-mat.dis-nn · stat.ML

Pith reviewed 2026-06-30 07:01 UTC · model grok-4.3

classification: cs.LG · cond-mat.dis-nn · stat.ML

keywords: latent sub-manifolds · distance matrix spectrum · eigenvalue multiplets · coordinate-free inference · manifold dimension · noise robustness · random matrix theory

The pith

The dimension of a latent sub-manifold is recovered from the multiplicity of a top eigenvalue multiplet in its distance matrix.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops I-BBS to recover the dimension and geometry of a latent low-dimensional sub-manifold from the pairwise distance matrix of points in a high-dimensional ambient space, without needing coordinates or full access to that space. It models the mixing of manifold signal with off-manifold components using two classes of generative noise and shows that the eigenvalues reorganize collectively, so geometry is read from two surviving integer-stable signatures instead of individual eigenvalues. The multiplicity of the top non-Perron multiplet fixes the manifold dimension d, and a parameter-free law governs how those multiplet positions shrink as noise grows. These signatures prove more stable than the continuous spectral slope on synthetic spheres of dimensions one through three, and a blind test recovers both the manifold and the noise model from one matrix.

Core claim

Bogomolny, Bohigas and Schmit observed that the spectrum of the pairwise distance matrix on N points sampled from a smooth d-dimensional manifold encodes a signature of the underlying geometry. I-BBS recovers the latent geometry from the multiplicity of the top non-Perron multiplet, which fixes d, and a parameter-free law for the shrinkage of these multiplet positions as noise increases, even when the two generative noise classes cause collective reorganization of the eigenvalues.

What carries the argument

The top non-Perron multiplet in the eigenvalue spectrum of the noisy distance matrix, whose multiplicity fixes the manifold dimension d and whose positions follow a parameter-free shrinkage law under increasing noise.
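As a noise-free illustration of the multiplicity signature, the sketch below samples points on a unit sphere, builds the geodesic distance matrix, and counts the eigenvalues in the leading non-Perron cluster. It relies on a standard fact: on $S^{d-1} \subset \mathbb{R}^d$ the degree-1 spherical harmonics are the $d$ coordinate functions, so the $\ell = 1$ multiplet has multiplicity $d$ (3 for $S^2$), which appears to match the convention used in the figure captions. The function names and the ratio-based gap rule with `ratio_threshold=0.5` are assumptions made for this example; they are not the paper's detector or its threshold τ.

```python
import numpy as np


def sample_sphere(n_points, embed_dim, rng):
    """Draw points uniformly on the unit sphere S^(embed_dim - 1)."""
    x = rng.standard_normal((n_points, embed_dim))
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def geodesic_distance_matrix(points):
    """Pairwise great-circle distances between unit vectors."""
    cosines = np.clip(points @ points.T, -1.0, 1.0)  # guard arccos against rounding
    dist = np.arccos(cosines)
    np.fill_diagonal(dist, 0.0)
    return dist


def top_multiplet_multiplicity(dist, ratio_threshold=0.5):
    """Count eigenvalues in the leading non-Perron cluster.

    Hypothetical detector (an assumption of this sketch): rank the
    non-Perron eigenvalues by magnitude and stop at the first place where
    the next magnitude falls below ratio_threshold times the current one.
    """
    eigvals = np.linalg.eigvalsh(dist)                # ascending order
    non_perron = np.sort(np.abs(eigvals[:-1]))[::-1]  # drop the Perron (largest) eigenvalue
    tiny = np.finfo(float).tiny                       # avoid division by zero
    ratios = non_perron[1:] / np.maximum(non_perron[:-1], tiny)
    drops = np.nonzero(ratios < ratio_threshold)[0]
    return int(drops[0]) + 1 if drops.size else int(non_perron.size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for embed_dim in (2, 3, 4):  # S^1, S^2, S^3
        pts = sample_sphere(500, embed_dim, rng)
        m = top_multiplet_multiplicity(geodesic_distance_matrix(pts))
        print(f"S^{embed_dim - 1}: top non-Perron multiplicity = {m}")
```

The gap rule works here because the geodesic kernel has a large separation between its $\ell = 1$ and $\ell = 3$ eigenvalues on low-dimensional spheres; with noise or other manifolds the threshold would need to be revisited.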

If this is right

The integer signatures are far more stable under noise than the continuous spectral slope on synthetic spheres $S^1$, $S^2$ and $S^3$.
A blind test recovers both the manifold and the noise model from a single distance matrix.
The method applies directly to neural-network representations and the dynamic training regime.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The approach could extend to any data source that supplies only pairwise distances, such as similarity graphs or partial observations.
Verification on non-spherical manifolds would test whether the multiplet signatures generalize beyond the synthetic spheres used.
The parameter-free shrinkage law might link to broader properties of random distance matrices independent of the specific noise models.

Load-bearing premise

The two generative noise classes sufficiently capture how real data mixes latent manifold signal with off-manifold components so that the claimed integer signatures remain identifiable and stable.

What would settle it

Finding that the multiplicity of the top non-Perron multiplet fails to match the known dimension d or changes with noise level on a synthetic sphere under either noise model would falsify the recovery claim.
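The falsification check described here can be sketched as a sweep over noise levels on a synthetic sphere. The noise mechanism below is a stand-in chosen for illustration: isotropic Gaussian off-manifold components of relative size `eps` are appended to the latent coordinates and the result is re-projected to the ambient unit sphere. It is not the paper's RSM or FSM model, whose definitions are not reproduced on this page, and the gap rule is the same hypothetical ratio detector as any simple cluster count, not the paper's detector.

```python
import numpy as np


def noisy_geodesic_matrix(n_points, latent_dim, ambient_dim, eps, rng):
    """Geodesic distances in the ambient sphere for noisy latent samples.

    Assumed noise model (illustrative only, not RSM/FSM): latent unit
    vectors in R^latent_dim are padded with Gaussian residual coordinates
    of typical norm eps, then renormalised to unit length.
    """
    latent = rng.standard_normal((n_points, latent_dim))
    latent /= np.linalg.norm(latent, axis=1, keepdims=True)
    resid_dim = ambient_dim - latent_dim
    resid = rng.standard_normal((n_points, resid_dim)) * eps / np.sqrt(resid_dim)
    ambient = np.hstack([latent, resid])
    ambient /= np.linalg.norm(ambient, axis=1, keepdims=True)
    dist = np.arccos(np.clip(ambient @ ambient.T, -1.0, 1.0))
    np.fill_diagonal(dist, 0.0)
    return dist


def top_multiplicity(dist, ratio_threshold=0.5):
    """Size of the leading non-Perron eigenvalue cluster (hypothetical rule)."""
    eigvals = np.linalg.eigvalsh(dist)                # ascending order
    mags = np.sort(np.abs(eigvals[:-1]))[::-1]        # non-Perron, by magnitude
    ratios = mags[1:] / np.maximum(mags[:-1], np.finfo(float).tiny)
    drops = np.nonzero(ratios < ratio_threshold)[0]
    return int(drops[0]) + 1 if drops.size else int(mags.size)


def multiplicity_sweep(eps_values, n_points=500, latent_dim=3,
                       ambient_dim=128, seed=0):
    """Map each noise level to the detected top-multiplet multiplicity."""
    rng = np.random.default_rng(seed)
    return {
        eps: top_multiplicity(
            noisy_geodesic_matrix(n_points, latent_dim, ambient_dim, eps, rng)
        )
        for eps in eps_values
    }


if __name__ == "__main__":
    for eps, m in multiplicity_sweep([0.0, 0.2, 0.4]).items():
        print(f"eps = {eps:.1f}: detected multiplicity = {m}")
```

A detected multiplicity that departs from `latent_dim` at small `eps`, or that changes with `eps` while the gap is still open, would be the kind of failure this section describes. Running the same sweep under the paper's actual noise models would require their definitions from the full text.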

Figures

Figures reproduced from arXiv: 2606.29675 by Igor Halperin.

**Figure 1.** Contrasts the two: on $S^2$ the delocalised slope follows the BBS prediction $-d/(d-1)$ with the SO(3) multiplet plateaus 1, 3, 5, … at the top, while on $S^{127}$ the spectrum is a single dominant, only weakly decaying band of the largest ∼128 eigenvalues (the ℓ = 1 multiplet, spread out in the MP regime) ending in a sharp drop, the delocalised window $[2, \sqrt{N}]$ not reaching past this first multiplet ($\sqrt{N} \approx$ …

**Figure 2.** Empirical spectral density of the negative eigenvalues of the geodesic matrix (Perron …

**Figure 3.** Angular-momentum-level shrinkage $f_\ell(\varepsilon)$ for degrees ℓ = 1, 3, 5: parameter-free prediction (19) (lines) and simulated multiplet shrinkage (markers) on the $S^2/S^{124}$ setup. [Plot residue: panel (a) sector map, … = 0.32, $D - d = 125$; x-axis: latent degree r; y-axis: $|a_{rs}|$; series s = 0 to 4, with a (1,0) dimension cluster marked; a further panel with a margin axis …]

**Figure 4.** Product-kernel spectrum of the RSM operator …

**Figure 5.** ESDs of the latent $M^{(d)}$ (blue), residual $M^{(D-d)}$ (red), and RSM ambient $M^{(D)}$ at $\varepsilon \in \{0.10, 0.22, 0.32, 0.45\}$ ($\eta_{\cos} \approx \{1, 5, 10, 20\}\%$, Eq. (21)) for a uniform reference (top-left) and beta-product concentrated single-particle densities at 40%, 35%, 30%, 25%, and 20% sphere coverage (sharper beta peaks, deeper ergodicity breaking; the per-sphere sharpness $\kappa_{\mathrm{lat}}$, $\kappa_{\mathrm{res}}$ is shown in each panel). $N = 1000$, $d = 3$, $D$ …

**Figure 6.** Multiplicity recovery $P(\hat{h}_1 = h(1, d))$ vs noise η for RSM and FSM on $S^1$, $S^2$, $S^3$ ($N = 1000$, $D = 128$, 20 realisations, detector threshold τ = 0.30). RSM recovers it up to η = 0.80; FSM up to η = 0.50, then fails when its inter-multiplet gap closes (Sec. 5.2). 5.2 Eigenvalue stability and the shrinkage law. The multiplicity is an integer; the eigenvalues that carry it are not. We track the mean magnitude o…

**Figure 7.** Top: ℓ = 1 multiplet shrinkage $|\Lambda^{(D)}_{\ell=1}| / |\Lambda^{(d)}_{\ell=1}|$ vs η for RSM and FSM on $S^1$, $S^2$, $S^3$, with the Funk–Hecke prediction $f_1$ (dashed); RSM lies on it, FSM shrinks more slowly. The green curve is the quasi-degenerate block reduction (Pymablock [43], Appendix E) of the bounded spliced mean drift; it tracks RSM and $f_1$ across the full range on all three manifolds. Bottom: the log inter-multiplet gap with t…

**Figure 8.** Blind identification from a single $M^{(D)}$, pooled over $0.05 \le \eta \le 0.50$ and $S^1$–$S^3$. (a) Manifold confusion from the gap-protected multiplicity (identity). (b) Blind ℓ = 2 component ratio $|\hat{\lambda}_2 / \hat{\lambda}_1|$ from the top $\cos M^{(D)}$ eigenvectors: RSM at the parity floor, FSM rising with η, separated by a geometric-mean boundary (dashed). (c) The resulting noise-model confusion matrix. 5.4 Statistical validation: null …

**Figure 9.** Statistical validation of the multiplicity detector with …

**Figure 10.** Anisotropic-sampling negative control on …

**Figure 11.** … c). The same sensitivity makes the localised branch the sharpest fingerprint of the noise mechanism: the isotropic RSM and the anisotropic, ℓ = 2-injecting FSM fill the small-$|\Lambda|$ region with visibly different shapes, FSM flattening it toward a Wigner-like plateau (see Fig. 11d). This is the residual random-matrix discrimination of Appendix C.5, read directly off the localised branch. [Plot residue: x-axis rank K …]

**Figure 12.** RSM forward drift (left) and volatility (right) correction prefactors versus …

**Figure 13.** Convergence of the angular-momentum Gegenbauer expansion of the RSM drift (one fixed …

**Figure 14.** Empirical spectral densities of the negative eigenvalues of the geodesic distance matrix …

**Figure 15.** Signal limit: the top eigenvalues $|\Lambda_K|/N$ of the stored matrices ($N = 1000$, markers with error bars over 20 realisations) against the operator spectrum $\mathrm{eig}(T_{\bar{p}})$ from an independent $M = 4000$ sample (lines), on $S^2$ (left) and $S^{124}$ (right). The four single-particle patterns are reproduced by their respective $\bar{p}$; the per-angle pattern (sc3) shows the larger scatter expected of a quenched anisotropy. C.3 Nonun…

**Figure 16.** Shows the $S^2$ rank plot: the operator counting tracks the simulated $|\Lambda_K|$ and bends from a steeper small-K slope toward the $\beta_{\mathrm{BBS}}$ reference at larger K. The counting route (C.6) and the resolvent (see Sec. C.4) are not independent of each other, both built on the same operator spectrum; the independent check is this operator prediction against the direct diagonalisation, which it matches …

**Figure 17.** Closed-form bulk density from the deformed-MP self-consistent equation …

**Figure 18.** Free multiplicative deconvolution of the operator spectrum (the inverse of the Silverstein …

**Figure 19.** Residual-spectrum RMT consistency on $S^2$ ($N = 1000$, $D - d = 125$). Top row: oracle residual $R = M^{(D)} - M^{(d)}$. Bottom row: operational residual $\hat{R} = M^{(D)} - \hat{M}^{(d)}$ from (C.16) with $K_{\mathrm{lat}} = \lfloor \sqrt{N} \rfloor$. Columns are $\varepsilon \in \{0.10, 0.22, 0.32, 0.45\}$. The dashed curve is the exploratory semicircle reference (C.17) at the residual's own entry variance; the annotations give the ratio of the latent eigenvalue $|\Lambda^{(d)}|$ to the…

**Figure 20.** Finite-N delocalised-slope bias $|\hat{\beta}_{\mathrm{deloc}} - \beta_{\mathrm{BBS}}|$ versus N for the geodesic matrix on $S^1$, $S^2$, $S^3$ (uniform and product-Beta sampling, two rank windows). The black line is the deterministic operator-only bias from the ranked $\{N|a_\ell|\}$ (no noise): it accounts for most of the drift, the full random-matrix curves lying close to it. The curves are nearly parallel across the uniform and product-Beta sampling …

**Figure 21.** Delocalised-branch slope bias $|\Delta\beta(N, d)|$ vs N on latent samples of $S^{d-1}$. Solid lines: power-law fits with a shallow exponent ≈ 0.2–0.3 for d = 2, 3, 4, tracking the deterministic operator-window drift of Sec. D.1; the rate is shallower than the $N^{-1/2}$ of a noise-only estimate. Applied to ambient data, the corrected slope matches the BBS asymptote at ε = 0 but drifts monotonically downward under noise: o…

**Figure 22.** Quasi-degenerate perturbation theory (Pymablock [43]) against the full simulation. (a) …

Original abstract

Bogomolny, Bohigas and Schmit (BBS) found that the spectrum of the pairwise distance matrix on N points sampled from a smooth d-dimensional manifold encodes a signature of the underlying geometry. We develop I-BBS (Inference-BBS), a coordinate-free method that identifies a low-dimensional latent sub-manifold embedded in a high-dimensional ambient distance matrix alone, without accessing an ambient high-dimensional vector space. It therefore applies even when that space is only partly observable or undefined. We model the ambient embedding by two classes of generative noise, model-based and model-free. The noise mixes the latent signal with off-manifold components, so the eigenvalues reorganise collectively and the latent geometry cannot be read off eigenvalue by eigenvalue. We recover it instead from two integer-stable signatures that survive the noise: the multiplicity of the top non-Perron multiplet, which fixes $d$, and a parameter-free law for how the multiplet positions shrink as the noise grows. On synthetic spheres $S^1$, $S^2$ and $S^3$ these integer signatures are far more stable under noise than the continuous spectral slope, and a blind test recovers both the manifold and the noise model from a single distance matrix. Applications to neural-network representations and to the dynamic training regime are developed in two companion papers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

I-BBS turns the BBS distance-matrix spectrum into an inference tool for latent d via multiplicity and a claimed parameter-free shrinkage under two noise models, but the support stays thin on derivations and real-data coverage.

The core claim is that you can recover the dimension and some geometry of a latent manifold from an ambient distance matrix alone by tracking two stable integer signatures under modeled noise: the multiplicity of the top non-Perron multiplet fixes d, and a parameter-free shrinkage law tracks how those eigenvalues move as noise increases. They test this on synthetic spheres $S^1$–$S^3$ and report that a blind procedure recovers both the manifold and which noise class was used.

What is actually new is the explicit inference step that combines multiplicity with the shrinkage law under the two generative classes (model-based and model-free), plus the claim that these signatures survive collective eigenvalue reorganization better than a continuous spectral slope. The coordinate-free framing is useful for settings where the ambient vector space is only partly observed.

The synthetic stability results are the strongest part shown. The soft spots are the lack of any derivation or error analysis for the parameter-free law in the supplied text, and the assumption that the two noise classes are representative enough that the signatures stay identifiable in practice. If real off-manifold components have different correlation structure, the multiplet could split or lose clear multiplicity, and nothing in the abstract rules that out or quantifies the risk. The circularity concern is minor if the law is independently derived, but that cannot be checked yet.

This is for people working on manifold recovery from distances or on neural-net representation geometry. A reader already following spectral methods on point clouds would get the idea quickly. The work deserves a serious referee to see whether the full paper supplies the missing derivations and broader tests.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces I-BBS, a coordinate-free extension of BBS random distance matrix theory for inferring the dimension d and geometry of a latent d-dimensional sub-manifold embedded in an ambient distance matrix. It models the embedding via two generative noise classes (model-based and model-free), under which eigenvalues reorganize collectively; geometry is recovered from two integer-stable signatures—the multiplicity of the top non-Perron multiplet (fixing d) and a parameter-free shrinkage law for multiplet positions with increasing noise. Synthetic tests on spheres S¹–S³ demonstrate greater stability than continuous spectral slope, and a blind test recovers both manifold and noise model from a single distance matrix. Applications to neural-network representations are noted for companion papers.

Significance. If the claimed integer signatures prove robust and the two noise classes are representative, the method would provide a genuinely coordinate-free route to manifold dimension and geometry recovery from distance data alone, even when the ambient vector space is only partially observable. This would be a notable contribution to manifold learning and random-matrix applications in high-dimensional data analysis.

major comments (2)

[Abstract] The central claim that the two generative noise classes (model-based and model-free) sufficiently capture real-data mixing of latent manifold signal with off-manifold components, so that the integer signatures remain identifiable, is asserted in the abstract but not shown to be exhaustive. No demonstration is supplied that the multiplet multiplicity and shrinkage law survive other plausible perturbations (e.g., low-rank structured noise or manifold-dependent noise) that could split or shift the multiplet and render d unrecoverable.
[Abstract] The abstract describes the signatures and synthetic sphere tests but supplies no derivations, error analysis, or quantitative results; support for the parameter-free law and noise-model recovery therefore cannot be assessed.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their thoughtful review and constructive comments on our manuscript. We respond to each major comment below, indicating where revisions will be incorporated.

Referee: [Abstract] The central claim that the two generative noise classes (model-based and model-free) sufficiently capture real-data mixing of latent manifold signal with off-manifold components, so that the integer signatures remain identifiable, is asserted in the abstract but not shown to be exhaustive. No demonstration is supplied that the multiplet multiplicity and shrinkage law survive other plausible perturbations (e.g., low-rank structured noise or manifold-dependent noise) that could split or shift the multiplet and render d unrecoverable.

Authors: We appreciate the referee drawing attention to the scope of our noise modeling. The manuscript introduces the model-based and model-free classes as two representative generative mechanisms under which the eigenvalues reorganize collectively, allowing recovery via the integer signatures; it does not assert that these classes are exhaustive or that they capture every possible real-data perturbation. The synthetic sphere experiments demonstrate stability of the multiplicity and shrinkage law specifically under the proposed noise classes. We agree that additional perturbations (such as low-rank structured noise or manifold-dependent noise) are not tested and could potentially affect identifiability. In revision we will update the abstract and add a dedicated limitations paragraph in the discussion to explicitly scope the claims to the two modeled noise classes and note that robustness to other perturbation types remains an open question for future work.

Revision: yes

Referee: [Abstract] The abstract describes the signatures and synthetic sphere tests but supplies no derivations, error analysis, or quantitative results; support for the parameter-free law and noise-model recovery therefore cannot be assessed.

Authors: The abstract is written as a high-level summary of the method, signatures, and key experimental outcomes. The derivations of the integer signatures and parameter-free shrinkage law, together with the error analysis, quantitative stability comparisons against spectral slope, and details of the blind noise-model recovery test, are all contained in the main body of the manuscript (theoretical sections and experimental results). Because the abstract's role is to provide an overview rather than technical detail, we do not plan to expand it with derivations or quantitative tables.

Revision: no

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained against external benchmarks

The paper extends the established BBS spectral signature by introducing two explicit generative noise classes (model-based and model-free) to model off-manifold mixing, then identifies the multiplicity of the top non-Perron multiplet (fixing d) and a claimed parameter-free shrinkage law as the surviving integer signatures. No equations or fitting procedures are exhibited in the abstract or description that would reduce the shrinkage law to a fitted parameter renamed as prediction, nor is any self-citation load-bearing, uniqueness theorem imported from the authors, or ansatz smuggled via prior work. Validation occurs on synthetic spheres S^1–S^3 with blind recovery tests, which are independent of the target result rather than tautological. The central claim therefore rests on the modeling assumptions and empirical stability rather than reducing by construction to its inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review is abstract-only; no explicit free parameters, invented entities, or additional axioms beyond the foundational BBS spectral observation are stated.

axioms (1)

domain assumption The spectrum of the pairwise distance matrix on N points sampled from a smooth d-dimensional manifold encodes a signature of the underlying geometry.
Invoked as the starting point for developing I-BBS inference under noise.

pith-pipeline@v0.9.1-grok · 5771 in / 1177 out tokens · 17832 ms · 2026-06-30T07:01:44.708622+00:00 · methodology

Reference graph

Works this paper leans on

54 extracted references · 8 canonical work pages · 6 internal anchors

[1] E. Bogomolny, O. Bohigas, and C. Schmit. Spectral properties of distance matrices. Journal of Physics A: Mathematical and General, 36:3595–3616, 2003. arXiv:nlin/0301044.
[2] E. Bogomolny, O. Bohigas, and C. Schmit. Distance matrices and isometric embeddings. arXiv:0710.2063, 2007.
[3] I. Halperin. Learning as Observable Matrix Dynamics: Diffusive Relaxations versus Phase Transitions. 2026.
[4] I. Halperin. Grokking as Bagel Formation in Activation Space: Spectral Evidence for a Phase Transition. 2026.
[5] J. Bun, J.-P. Bouchaud, and M. Potters. Cleaning large correlation matrices: tools from Random Matrix Theory. Physics Reports, 666:1–109, 2017.
[6] E. Levina and P. J. Bickel. Maximum likelihood estimation of intrinsic dimension. Advances in Neural Information Processing Systems (NeurIPS), 2004.
[7] E. Facco, M. d'Errico, A. Rodriguez, and A. Laio. Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific Reports, 7(1):12140, 2017.
[8] F. Chazal and B. Michel. An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists. Frontiers in Artificial Intelligence, 4:667963, 2021.
[9] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod, and H. A. Harrington. A roadmap for the computation of persistent homology. EPJ Data Science, 6(1):17, 2017.
[10] R. R. Coifman and S. Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006.
[11] N. El Karoui. The spectrum of kernel random matrices. Annals of Statistics, 38(1):1–50, 2010.
[12] C. Bordenave. On Euclidean random matrices in high dimension. arXiv:1209.5888, 2012.
[13] R. Couillet and Z. Liao. Random Matrix Methods for Machine Learning. Cambridge University Press, 2022.
[14] S. Lele. Euclidean Distance Matrix Analysis (EDMA): Estimation of Mean Form and Mean Form Difference. Mathematical Geology, 25(5):573–602, 1993.
[15] M. Mézard, G. Parisi, and A. Zee. Spectra of Euclidean random matrices. Nuclear Physics B, 559(3):689–710, 1999.
[16] P. Casaburi and P. Vivo. Largest eigenvalue and top eigenvector statistics of large Euclidean random matrices. arXiv:2604.26852, 2026.
[17] P. Casaburi and P. Vivo. Replica approach to extreme eigenvalues of Euclidean random matrices. Journal of Physics A: Mathematical and Theoretical, 2026.
[18] N. El Karoui. Spectrum estimation for large dimensional covariance matrices using random matrix theory. The Annals of Statistics, 36(6):2757–2790, 2008.
[19] H. Chen and R. Ma. Statistical Inference for Manifold Similarity and Alignability across Noisy High-Dimensional Datasets. arXiv:2511.21074, 2025.
[20] A. Rahimi and B. Recht. Random features for large-scale kernel machines. Advances in Neural Information Processing Systems (NIPS), 2007.
[21] F. Liu, X. Huang, Y. Chen, and J. A. K. Suykens. Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):7128–7148, 2022 (arXiv:2004.11154, 2021).
[22] D. Paul. Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17(4):1617–1642, 2007.
[23] D. Donoho and M. Gavish. Minimax risk of matrix denoising by singular value thresholding. The Annals of Statistics, 42(6):2413–2440, 2014.
[24] Y. Yan, Y. Chen, and J. Fan. Inference for heteroskedastic PCA with missing data. The Annals of Statistics, 52(2):729–756, 2024.
[25] X. Ding and R. Ma. Learning low-dimensional nonlinear structures from high-dimensional noisy data: an integral operator approach. The Annals of Statistics, 51(4):1744–1769, 2023.
[26] K. V. Mardia and P. E. Jupp. Directional Statistics. John Wiley & Sons, 2nd edition, 2000.
[27] H.-T. Wu and N. Wu. Think globally, fit locally under the manifold setup: asymptotic analysis of locally linear embedding. The Annals of Statistics, 46(6B):3805–3837, 2018.
[28] J. Li. Asymptotic normality of interpoint distances for high-dimensional data with applications to the two-sample problem. Biometrika, 105(3):529–546, 2018.
[29] W. Li, Q. Wang, and J. Yao. Eigenvalue distribution of a high-dimensional distance covariance matrix with application. Statistica Sinica, 33(1):149–168, 2023.
[30] M. Meilă and H. Zhang. Manifold learning: what, how, and why. Annual Review of Statistics and Its Application, 11:393–417, 2024.
[31] X. Ding and R. Ma. Kernel spectral joint embeddings for high-dimensional noisy datasets using duo-landmark integral operators. Journal of the American Statistical Association, 1–28, 2025.
[32] S. Smale and D.-X. Zhou. Geometry on probability spaces. Constructive Approximation, 30(3):311–323, 2009.
[33] A. Erdélyi, W. Magnus, F. Oberhettinger, and F. G. Tricomi. Higher Transcendental Functions, Vol. II (Bateman Manuscript Project). McGraw-Hill, 1953.
[34] G. Marsaglia and I. Olkin. Generating correlation matrices. SIAM Journal on Scientific and Statistical Computing, 5(2):470–475, 1984.
[35] I. J. Schoenberg. Positive definite functions on spheres. Duke Math. J., 9:96–108, 1942.
[36] J. Baik, G. Ben Arous, and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005.
[37] F. Benaych-Georges and R. R. Nadakuditi. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011.
[38] L. Pastur and V. Vasilchuk. On the law of addition of random matrices. Communications in Mathematical Physics, 214:249–286, 2000.
[39] A. Zee. Law of addition in random matrix theory. Nuclear Physics B, 474(3):726–744, 1996.
[40] G. Parisi. Euclidean random matrices: solved and open problems. In Applications of Random Matrices in Physics, NATO Science Series II, vol. 221, Springer, 2006 (arXiv:cond-mat/0512004).
[41] L. D. Landau and E. M. Lifshitz. Quantum Mechanics: Non-Relativistic Theory (Course of Theoretical Physics, Vol. 3). Pergamon Press, 3rd edition, 1977 (§38–39).
[42] G. Hose and U. Kaldor. Quasidegenerate perturbation theory. The Journal of Physical Chemistry, 86(12):2133–2140, 1982.
[43] I. Araya Day, S. Miles, H. K. Kerstens, D. Varjas, and A. R. Akhmerov. Pymablock: An algorithm and a package for quasi-degenerate perturbation theory. SciPost Physics Codebases, 50, 2025.
[44] P.-O. Löwdin. Studies in perturbation theory. IV. Solution of eigenvalue problem by projection operator formalism. Journal of Mathematical Physics, 3(5):969–982, 1962.
[45] J. R. Schrieffer and P. A. Wolff. Relation between the Anderson and Kondo Hamiltonians. Physical Review, 149(2):491–492, 1966.
[46] A. J. Smola, Z. L. Óvári, and R. C. Williamson. Regularization with dot-product kernels. In Advances in Neural Information Processing Systems 13 (NIPS 2000), pages 308–314. MIT Press, 2001.
[47] C. Davis and W. M. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7(1):1–46, 1970.
[48] S. Helgason. Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions. Academic Press, 1984.
[49] D. Azevedo and V. S. Barbosa. Covering numbers of isotropic reproducing kernels on compact two-point homogeneous spaces. Mathematische Nachrichten, 291(1):1–15, 2018.
[50] A. Goetschy and S. E. Skipetrov. Euclidean random matrices and their applications in physics. arXiv:1303.2880, 2013.
[51] V. N. Stepanov. The method of spherical harmonics for integral transforms on a sphere. Mathematical Structures and Modeling, 2(42):36–48, 2017.
[52] J. A. Mingo and R. Speicher. Free Probability and Random Matrices. Fields Institute Monographs, Springer, 2017.
[53] Z. D. Bai and J. W. Silverstein. Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, 2010.
[54] G. W. Anderson, A. Guionnet, and O. Zeitouni. An Introduction to Random Matrices. Cambridge University Press, 2010.

[1] [1]

Spectral properties of distance matrices

E. Bogomolny, O. Bohigas, and C. Schmit. Spectral properties of distance matrices.Journal of Physics A: Mathematical and General, 36:3595–3616, 2003. arXiv:nlin/0301044

work page internal anchor Pith review Pith/arXiv arXiv 2003

[2] [2]

Distance matrices and isometric embeddings

E. Bogomolny, O. Bohigas, and C. Schmit. Distance matrices and isometric embeddings. arXiv:0710.2063, 2007

work page internal anchor Pith review Pith/arXiv arXiv 2063

[3] [3]

Halperin.Learning as Observable Matrix Dynamics: Diffusive Relaxations versus Phase Transitions.2026

I. Halperin.Learning as Observable Matrix Dynamics: Diffusive Relaxations versus Phase Transitions.2026

2026

[4] [4]

Halperin.Grokking as Bagel Formation in Activation Space: Spectral Evidence for a Phase Transition.2026

I. Halperin.Grokking as Bagel Formation in Activation Space: Spectral Evidence for a Phase Transition.2026

2026

[5] J. Bun, J.-P. Bouchaud, and M. Potters. Cleaning large correlation matrices: tools from Random Matrix Theory. Physics Reports, 666:1–109, 2017

[6] E. Levina and P. J. Bickel. Maximum likelihood estimation of intrinsic dimension. Advances in Neural Information Processing Systems (NeurIPS), 2004

[7] E. Facco, M. d’Errico, A. Rodriguez, and A. Laio. Estimating the intrinsic dimension of datasets by a minimal neighborhood information. Scientific Reports, 7(1):12140, 2017

[8] F. Chazal and B. Michel. An introduction to Topological Data Analysis: fundamental and practical aspects for data scientists. Frontiers in Artificial Intelligence, 4:667963, 2021

[9] N. Otter, M. A. Porter, U. Tillmann, P. Grindrod, and H. A. Harrington. A roadmap for the computation of persistent homology. EPJ Data Science, 6(1):17, 2017

[10] R. R. Coifman and S. Lafon. Diffusion maps. Applied and Computational Harmonic Analysis, 21(1):5–30, 2006

[11] N. El Karoui. The spectrum of kernel random matrices. The Annals of Statistics, 38(1):1–50, 2010

[12] C. Bordenave. On Euclidean random matrices in high dimension. arXiv:1209.5888, 2012

[13] R. Couillet and Z. Liao. Random Matrix Methods for Machine Learning. Cambridge University Press, 2022

[14] S. Lele. Euclidean Distance Matrix Analysis (EDMA): Estimation of Mean Form and Mean Form Difference. Mathematical Geology, 25(5):573–602, 1993

[15] M. Mézard, G. Parisi, and A. Zee. Spectra of Euclidean random matrices. Nuclear Physics B, 559(3):689–710, 1999

[16] P. Casaburi and P. Vivo. Largest eigenvalue and top eigenvector statistics of large Euclidean random matrices. arXiv:2604.26852, 2026

[17] P. Casaburi and P. Vivo. Replica approach to extreme eigenvalues of Euclidean random matrices. Journal of Physics A: Mathematical and Theoretical, 2026

[18] N. El Karoui. Spectrum estimation for large dimensional covariance matrices using random matrix theory. The Annals of Statistics, 36(6):2757–2790, 2008

[19] H. Chen and R. Ma. Statistical Inference for Manifold Similarity and Alignability across Noisy High-Dimensional Datasets. arXiv:2511.21074, 2025

[20] A. Rahimi and B. Recht. Random features for large-scale kernel machines. Advances in Neural Information Processing Systems (NIPS), 2007

[21] F. Liu, X. Huang, Y. Chen, and J. A. K. Suykens. Random Features for Kernel Approximation: A Survey on Algorithms, Theory, and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(10):7128–7148, 2022 (arXiv:2004.11154, 2021)

[22] D. Paul. Asymptotics of sample eigenstructure for a large dimensional spiked covariance model. Statistica Sinica, 17(4):1617–1642, 2007

[23] D. Donoho and M. Gavish. Minimax risk of matrix denoising by singular value thresholding. The Annals of Statistics, 42(6):2413–2440, 2014

[24] Y. Yan, Y. Chen, and J. Fan. Inference for heteroskedastic PCA with missing data. The Annals of Statistics, 52(2):729–756, 2024

[25] X. Ding and R. Ma. Learning low-dimensional nonlinear structures from high-dimensional noisy data: an integral operator approach. The Annals of Statistics, 51(4):1744–1769, 2023

[26] K. V. Mardia and P. E. Jupp. Directional Statistics. John Wiley & Sons, 2nd edition, 2000

[27] H.-T. Wu and N. Wu. Think globally, fit locally under the manifold setup: asymptotic analysis of locally linear embedding. The Annals of Statistics, 46(6B):3805–3837, 2018

[28] J. Li. Asymptotic normality of interpoint distances for high-dimensional data with applications to the two-sample problem. Biometrika, 105(3):529–546, 2018

[29] W. Li, Q. Wang, and J. Yao. Eigenvalue distribution of a high-dimensional distance covariance matrix with application. Statistica Sinica, 33(1):149–168, 2023

[30] M. Meilă and H. Zhang. Manifold learning: what, how, and why. Annual Review of Statistics and Its Application, 11:393–417, 2024

[31] X. Ding and R. Ma. Kernel spectral joint embeddings for high-dimensional noisy datasets using duo-landmark integral operators. Journal of the American Statistical Association, 1–28, 2025

[32] S. Smale and D.-X. Zhou. Geometry on probability spaces. Constructive Approximation, 30(3):311–323, 2009

[33] A. Erdélyi, W. Magnus, F. Oberhettinger, and F. G. Tricomi. Higher Transcendental Functions, Vol. II (Bateman Manuscript Project). McGraw-Hill, 1953

[34] G. Marsaglia and I. Olkin. Generating correlation matrices. SIAM Journal on Scientific and Statistical Computing, 5(2):470–475, 1984

[35] I. J. Schoenberg. Positive definite functions on spheres. Duke Math. J., 9:96–108, 1942

[36] J. Baik, G. Ben Arous, and S. Péché. Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices. The Annals of Probability, 33(5):1643–1697, 2005

[37] F. Benaych-Georges and R. R. Nadakuditi. The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices. Advances in Mathematics, 227(1):494–521, 2011

[38] L. Pastur and V. Vasilchuk. On the law of addition of random matrices. Communications in Mathematical Physics, 214:249–286, 2000

[39] A. Zee. Law of addition in random matrix theory. Nuclear Physics B, 474(3):726–744, 1996

[40] G. Parisi. Euclidean random matrices: solved and open problems. In Applications of Random Matrices in Physics, NATO Science Series II, vol. 221, Springer, 2006 (arXiv:cond-mat/0512004)

[41] L. D. Landau and E. M. Lifshitz. Quantum Mechanics: Non-Relativistic Theory (Course of Theoretical Physics, Vol. 3). Pergamon Press, 3rd edition, 1977 (§38–39)

[42] G. Hose and U. Kaldor. Quasidegenerate perturbation theory. The Journal of Physical Chemistry, 86(12):2133–2140, 1982

[43] I. Araya Day, S. Miles, H. K. Kerstens, D. Varjas, and A. R. Akhmerov. Pymablock: An algorithm and a package for quasi-degenerate perturbation theory. SciPost Physics Codebases, 50, 2025

[44] P.-O. Löwdin. Studies in perturbation theory. IV. Solution of eigenvalue problem by projection operator formalism. Journal of Mathematical Physics, 3(5):969–982, 1962

[45] J. R. Schrieffer and P. A. Wolff. Relation between the Anderson and Kondo Hamiltonians. Physical Review, 149(2):491–492, 1966

[46] A. J. Smola, Z. L. Óvári, and R. C. Williamson. Regularization with dot-product kernels. In Advances in Neural Information Processing Systems 13 (NIPS 2000), pages 308–314. MIT Press, 2001

[47] C. Davis and W. M. Kahan. The rotation of eigenvectors by a perturbation. III. SIAM Journal on Numerical Analysis, 7(1):1–46, 1970

[48] S. Helgason. Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions. Academic Press, 1984

[49] D. Azevedo and V. S. Barbosa. Covering numbers of isotropic reproducing kernels on compact two-point homogeneous spaces. Mathematische Nachrichten, 291(1):1–15, 2018

[50] A. Goetschy and S. E. Skipetrov. Euclidean random matrices and their applications in physics. arXiv:1303.2880, 2013

[51] V. N. Stepanov. The method of spherical harmonics for integral transforms on a sphere. Mathematical Structures and Modeling, 2(42):36–48, 2017

[52] J. A. Mingo and R. Speicher. Free Probability and Random Matrices. Fields Institute Monographs, Springer, 2017

[53] Z. D. Bai and J. W. Silverstein. Spectral Analysis of Large Dimensional Random Matrices, 2nd ed. Springer, 2010

[54] G. W. Anderson, A. Guionnet, and O. Zeitouni. An Introduction to Random Matrices. Cambridge University Press, 2010