Detecting non-uniform patterns on high-dimensional hyperspheres
Pith reviewed 2026-05-19 12:08 UTC · model grok-4.3
The pith
A distance from pairwise inner products on hyperspheres captures minimax rates for uniformity testing across high-dimensional models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The Ingster-type distance derived from the distribution of pairwise inner products captures the minimax rates for testing uniformity simultaneously across several high-dimensional parametric models, even in models where densities with respect to the uniform law do not exist.
What carries the argument
Ingster-type distance constructed from the distribution of pairwise inner products on the hypersphere, enabling systematic Edgeworth-type asymptotic analysis.
If this is right
- The test is universally consistent in fixed dimensions.
- The test is minimax-optimal over a variety of high-dimensional parametric models.
- The test is consistent against non-local high-dimensional alternatives.
- The test provides local asymptotic distributions under the considered alternatives and new information lower bounds.
Where Pith is reading between the lines
- The same characterization might be adapted to test uniformity on other rotationally symmetric manifolds.
- The approach could connect to problems of testing isotropy in directional statistics without requiring density assumptions.
- Extensions to dependent observations or manifold-valued data could be explored using similar pairwise product statistics.
Load-bearing premise
The proposed probabilistic characterization of the uniform distribution on the hypersphere in terms of the distribution of pairwise inner products holds.
What would settle it
An explicit high-dimensional parametric model where the test based on this distance fails to attain the known minimax rate for detecting deviations from uniformity.
Original abstract
We propose a new probabilistic characterization of the uniform distribution on the hypersphere in terms of the distribution of pairwise inner products, extending the ideas of Cuesta-Albertos et al. (2007, 2009) in a data-driven manner. This characterization naturally leads to an Ingster-type distance for quantifying deviations from uniformity, whose asymptotic behavior can be analyzed systematically via Edgeworth-type expansions. Perhaps surprisingly, we show that this distance captures the minimax rates for testing uniformity simultaneously across several high-dimensional parametric models, even in the models where densities with respect to the uniform law do not exist. We then introduce a simple test for spherical uniformity based on this distance and study its detection rates and consistency against various classes of alternatives, both local and non-local. The proposed test is universally consistent in fixed dimensions, minimax-optimal over a variety of high-dimensional parametric models, and consistent against non-local high-dimensional alternatives. This is different from previously studied high-dimensional Sobolev tests and extreme-value-based tests, which are rate-suboptimal or inconsistent against one or more classes of alternatives. We also establish the local asymptotic distribution of the proposed test under the considered classes of alternatives, along with new information lower bounds.
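To make the idea of comparing pairwise inner products with their null law concrete, here is a minimal sketch in plain Python. It is not the paper's test: the function names, the midpoint-rule integration, and the choice of an unnormalized Kolmogorov-type distance are all assumptions made for illustration, and no critical values are computed. The only fact it relies on is that, for independent uniform points on S^{p-1}, the inner product has density proportional to (1 - u^2)^((p-3)/2) on [-1, 1].

```python
import math
import random
from itertools import combinations


def sample_uniform_sphere(n, p, rng):
    """Draw n points uniformly on S^{p-1} by normalizing Gaussian vectors."""
    points = []
    for _ in range(n):
        g = [rng.gauss(0.0, 1.0) for _ in range(p)]
        norm = math.sqrt(sum(v * v for v in g))
        points.append([v / norm for v in g])
    return points


def pairwise_inner_products(points):
    """All n(n-1)/2 inner products X_i^T X_j with i < j."""
    return [sum(a * b for a, b in zip(x, y)) for x, y in combinations(points, 2)]


def null_cdf(t, p, grid=400):
    """CDF of X_1^T X_2 under uniformity (p >= 3), via midpoint-rule
    integration of the density c_p * (1 - u^2)^((p-3)/2) over [-1, t]."""
    if p < 3:
        raise ValueError("this sketch assumes p >= 3")
    t = max(-1.0, min(1.0, t))  # guard against floating-point overshoot
    c = math.exp(math.lgamma(p / 2) - math.lgamma((p - 1) / 2)) / math.sqrt(math.pi)
    h = (t + 1.0) / grid
    expo = (p - 3) / 2
    total = 0.0
    for k in range(grid):
        u = -1.0 + (k + 0.5) * h
        total += (1.0 - u * u) ** expo
    return c * h * total


def kolmogorov_distance(values, p):
    """sup_t |F_n(t) - F_0(t)| between the empirical CDF of the inner
    products and the null CDF (hypothetical, unnormalized statistic)."""
    vals = sorted(values)
    m = len(vals)
    dist = 0.0
    for i, v in enumerate(vals, start=1):
        f0 = null_cdf(v, p)
        dist = max(dist, abs(i / m - f0), abs((i - 1) / m - f0))
    return dist


if __name__ == "__main__":
    rng = random.Random(0)
    sample = sample_uniform_sphere(40, 5, rng)
    print(kolmogorov_distance(pairwise_inner_products(sample), 5))
```

A sample concentrated at a single point gives a distance close to 1, while a uniform sample gives a small value; turning this into a calibrated test requires the asymptotic theory developed in the paper.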
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a probabilistic characterization of the uniform distribution on the hypersphere in terms of the law of pairwise inner products, extending prior projection ideas. This characterization yields an Ingster-type distance whose asymptotics are analyzed via Edgeworth expansions. The central claims are that this distance simultaneously attains the minimax rates for testing uniformity over several high-dimensional parametric models (including singular ones without densities w.r.t. the uniform measure), that a simple test based on the distance is consistent and rate-optimal, and that matching information lower bounds and local asymptotic distributions can be derived.
Significance. If the derivations hold, the work would be significant for providing a unified, data-driven approach to spherical uniformity testing that achieves optimality across multiple models where density-based methods fail. Credit is given for the explicit construction of both upper and lower bounds, the analysis of local and non-local alternatives, and the extension of projection methods to enable systematic Edgeworth analysis.
major comments (2)
- [§4.2] §4.2, the Edgeworth expansion for the normalized test statistic under local alternatives: the regularity conditions stated (finite moments and characteristic-function smoothness) do not obviously extend to the singular parametric models in which the induced law on inner products is discrete or lower-dimensional; this risks invalidating the remainder control used to obtain the precise detection boundaries.
- [§5.1] §5.1, Theorem 5.3: the claim that the distance captures the minimax rate simultaneously for all listed models relies on reducing the lower-bound construction to the inner-product marginal; the reduction step needs an explicit least-favorable sequence for each singular model to confirm that no rate gap is introduced by the characterization.
minor comments (2)
- [§2] Notation for the hypersphere S^{d-1} and the inner-product random variable should be introduced once in §2 and used consistently thereafter.
- [§6] A short table summarizing the detection rates across the parametric models would improve readability of the comparison with prior Sobolev and extreme-value tests.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address the two major comments point by point below. Revisions will be made to strengthen the technical justifications where indicated.
Point-by-point responses
-
Referee: [§4.2] §4.2, the Edgeworth expansion for the normalized test statistic under local alternatives: the regularity conditions stated (finite moments and characteristic-function smoothness) do not obviously extend to the singular parametric models in which the induced law on inner products is discrete or lower-dimensional; this risks invalidating the remainder control used to obtain the precise detection boundaries.
Authors: We thank the referee for highlighting this point. The pairwise inner-product random variable is always supported on the compact interval [-1,1], which immediately guarantees that all moments exist and are finite, independently of the underlying spherical model. For the characteristic-function smoothness requirement, we observe that the high-dimensional parametric families considered in the paper (including those singular with respect to the uniform measure) induce absolutely continuous laws on the inner-product interval whose densities are infinitely differentiable on the interior. This smoothness arises from the rotational invariance combined with the parametric structure, which effectively convolves to produce smooth marginals even when the original measure on the sphere lacks a density. In the revised manuscript we will insert a new lemma in §4.2 that verifies the required characteristic-function conditions model by model, thereby justifying the Edgeworth remainder control and the resulting detection boundaries. revision: yes
-
Referee: [§5.1] §5.1, Theorem 5.3: the claim that the distance captures the minimax rate simultaneously for all listed models relies on reducing the lower-bound construction to the inner-product marginal; the reduction step needs an explicit least-favorable sequence for each singular model to confirm that no rate gap is introduced by the characterization.
Authors: We agree that an explicit verification of the reduction step improves clarity. The lower-bound argument is formulated directly on the space of probability measures on the inner-product interval [-1,1], and each parametric model on the sphere maps to a subclass of such measures. To confirm that the reduction preserves the exact minimax rate, we will add, in the revised version of §5.1, explicit least-favorable sequences for each singular model. These sequences are obtained by selecting parameter paths whose induced inner-product laws lie at the detection boundary; the resulting lower bound matches the upper bound delivered by the proposed test, showing that the inner-product characterization introduces no rate gap. revision: yes
Circularity Check
Derivation chain is self-contained with independent characterization and analysis
Full rationale
The paper introduces a probabilistic characterization of uniformity on the hypersphere via the law of pairwise inner products, explicitly extending cited external results from Cuesta et al. (2007, 2009) rather than self-citations. This characterization is used to define an Ingster-type distance, after which the paper derives its asymptotic properties via Edgeworth expansions and proves (as a separate result) that the distance attains minimax rates across the listed parametric models. No step reduces the central optimality claim to a fitted parameter, a self-referential definition, or a load-bearing self-citation chain; the cited projection ideas are independent prior work, and the Edgeworth analysis is presented as a technical tool applied to the new distance rather than presupposing the target rates. The derivation therefore stands on its own mathematical content against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The distribution of pairwise inner products provides a probabilistic characterization of the uniform distribution on the hypersphere that extends prior projection ideas in a data-driven manner.
- domain assumption Edgeworth-type expansions can be applied systematically to analyze the asymptotic behavior of the proposed distance.
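As background for the second assumption, the null law of a single pairwise inner product is a standard fact and shows what a leading normal term and its first correction would look like. The block below is a worked reminder under that standard fact only; the symbols $T$, $f_p$, and $\kappa_p$ are notation introduced here, and the paper's own expansion for its test statistic is not reproduced.

```latex
% T = X_1^\top X_2 with X_1, X_2 independent and uniform on S^{p-1}, p \ge 2
\[
  f_p(t) = \frac{\Gamma(p/2)}{\sqrt{\pi}\,\Gamma((p-1)/2)}\,(1 - t^2)^{(p-3)/2},
  \qquad t \in [-1, 1],
\]
\[
  \mathbb{E}[T] = 0, \qquad
  \mathbb{E}[T^2] = \frac{1}{p}, \qquad
  \mathbb{E}[T^4] = \frac{3}{p(p+2)},
\]
\[
  \kappa_p := \frac{\mathbb{E}[T^4]}{(\mathbb{E}[T^2])^2} - 3 = -\frac{6}{p+2}
  \;\longrightarrow\; 0,
  \qquad
  \sqrt{p}\,T \xrightarrow{d} N(0, 1) \quad \text{as } p \to \infty .
\]
```

The excess kurtosis $\kappa_p$ is of order $1/p$, which is the kind of correction term an Edgeworth-type expansion keeps track of beyond the normal limit.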
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
We propose a new probabilistic characterization of the uniform distribution on hyperspheres in terms of its inner product... This characterization naturally leads to an Ingster-type distance... analyzed systematically via Edgeworth-type expansions.
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
the proposed test is universally consistent in fixed dimensions, minimax-optimal over a variety of high-dimensional parametric models, even in the models where densities with respect to the uniform law do not exist.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Rayleigh test in [14] and [44]. This test can be formulated in terms of a U-statistic of the data points with the inner product kernel, i.e. $R_n := \frac{\sqrt{2p}}{n} \sum_{1 \le i < j \le n} X_i^\top X_j$. (2)
-
[2]
Bingham test in [15, 51] and [44]. This test is also based on a U-statistic of the data points, but with a quadratic inner product kernel, i.e. $B_n := \frac{p}{n} \sum_{1 \le i < j \le n} \left[ (X_i^\top X_j)^2 - \frac{1}{p} \right]$. (3)
-
[3]
Packing test in [8]. This test is based on the smallest angle, i.e. $P_n := p \cdot \max_{1 \le i < j \le n} (X_i^\top X_j)^2 - 4 \log n + \log\log n$. (4) The asymptotic distributions of these test statistics, as well as their non-null behaviors, have been studied rigorously over the last few years; see, for example, [14, 15, 44, 45, 35]. It is known that the Rayleigh test $R_n$ and th...
-
[4]
is as follows. Let $X_1, X_2$ be random variables on $S^{p-1}$ for some $p \ge 1$ and $U \sim \mathrm{Unif}(S^{p-1})$; then, under some regularity conditions, $X_1^\top U \overset{d}{=} X_2^\top U \iff X_1 \overset{d}{=} X_2$. (7) Broadly speaking, the characterization (7) relies on projections onto independent, uniformly distributed directions. To construct testing procedures using (7), one often needs to sample the ...
-
[5]
Reduction in computational cost: Unlike projection-based methods, our test avoids sampling random directions or integrating over all possible directions, which often requires complex procedures to approximate the critical values. Theorem 1 demonstrates that when the dimension is large, the tail probabilities of our test statistic are much simpler to appro...
-
[6]
Flexibility in the high-dimensional settings: While existing projection-based tests are valid only in fixed-dimensional scenarios, our test extends seamlessly to high-dimensional settings, including cases where p is large and n is small. Extending projection-based tests to such settings is highly non-trivial, as it requires understanding how the eigenvalu...
-
[7]
It would be of significant interest to investigate other types of distances, such as the weighted-$L^2$ distance in the Cramér–von Mises and Anderson–Darling statistics. The limiting distribution under $H_0$ can be similarly derived as was done under the Kolmogorov distance in Theorem 1, and optimal properties can be established by choosing appropriate w...
-
[8]
Studying the optimal properties of the test ϕn is an interesting topic for future research. It is reasonable to conjecture that our proposed test ϕn is optimal among the class of rotationally invariant tests and over the class of alternatives which satisfy the condition (14). This would likely require some new tools, especially an equivalent and more conv...
-
[9]
It is of independent interest to extend Proposition 1 to other types of spherical distributions. We conjecture that the conclusion of Proposition 1 is still true for two arbitrary Borel probability measures, up to an orthogonal transformation.
5 Proofs
5.1 Proof of Proposition 1
Before presenting the proof, we first state a version of Lebesgue’s differenti...
-
[10]
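Then, we have $\mathbb{E} H_n^2(X_1, X_2) \to 0$ as $n \to \infty$. Proof of Lemma 4. Let $Y$ be drawn from the uniform distribution on $S^{p-1}$ independently from $X_1$ and $X_2$. Thanks to Lemma 3, we can write $\mathbb{E} H_n^2(X_1, X_2) = \mathbb{E}\, \mathbb{E}^2 \left[ h_n(X_1^\top Y) \cdot h_n(X_2^\top Y) \mid X_1, X_2 \right] = \mathbb{E}\, \mathbb{E}^2 \left[ h_n\!\left( \frac{\xi_1}{\|\xi\|} \right) \cdot h_n\!\left( X_1^\top X_2 \cdot \frac{\xi_1}{\|\xi\|} + \sqrt{1 - (X_1^\top X_2)^2} \cdot \frac{\xi_2}{\|\xi\|} \right) \,\Big|\, X_1, X_2 \right]$, where $\xi = (\xi_1, \xi_2, \ldots, \xi_p)^\top$ is a vector con...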
-
[11]
If $|I| = 2$, then all the terms have the form $\mathbb{E}\big(\eta^{(n)}_{a_1 a_2}\big)^4$. The total contribution of such terms is of order $O(n^{-2}) \cdot |s - t|$.
-
[12]
If $|I| = 3$, then up to a permutation, we can bound the expectation of the three possibilities: $\mathbb{E}\big[ (\eta^{(n)}_{12})^2 \cdot \eta^{(n)}_{23} \cdot \eta^{(n)}_{31} \big] \le 4|s - t|^2$, $\mathbb{E}\big[ (\eta^{(n)}_{12})^3 \cdot \eta^{(n)}_{23} \big] = 0$, $\mathbb{E}\big[ (\eta^{(n)}_{12})^2 \cdot (\eta^{(n)}_{23})^2 \big] = \big[ (s - t)(1 - s + t) \big]^2 \le (s - t)^2$. In this case, the total contribution of all terms of the above three types is of order $O(n^{-1} \cdot |s - t|^2)$.
-
[13]
If $|I| = 4$, then there are two scenarios: (i) each index shows up exactly twice and (ii) there exists at least one index that shows up exactly one time. For scenario (i), up to a permutation, all the terms have the form $\eta^{(n)}_{12} \cdot \eta^{(n)}_{23} \cdot \eta^{(n)}_{34} \cdot \eta^{(n)}_{41}$, whose expectation we can bound as $\mathbb{E}\big[ \eta^{(n)}_{12} \cdot \eta^{(n)}_{23} \cdot \eta^{(n)}_{34} \cdot \eta^{(n)}_{41} \big] \le 4\,\mathbb{E}\,\eta^{(n)}_{12} \cdot$ ...
-
[14]
If $5 \le |I| \le 8$, then there exist at least 2 indices that appear exactly once. Moreover, the terms in this case have to be the product of at least three $\eta^{(n)}$’s. Let $a$ and $b$ be these two indices. If they belong to two different $\eta^{(n)}$’s, then at least one of them has power 1 and the expectation vanishes by conditioning on all $X_i$’s except that index. If ...
-
[15]
B. Ajne. A simple test for uniformity of a circular distribution. Biometrika, 55(2):343–354, 1968
-
[16]
S. Balakrishnan and L. Wasserman. Hypothesis testing for high-dimensional multinomials: A selective review. Annals of Applied Statistics, 2018
-
[17]
A. Banerjee, I. Dhillon, J. Ghosh, and S. Sra. Generative model-based clustering of directional data. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 19–28, 2003
-
[18]
A. Banerjee and J. Ghosh. Frequency sensitive competitive learning for scalable balanced clustering on high-dimensional hyperspheres. IEEE T. Neural Network., 15:702–719, 2004
-
[19]
B. Bhattacharya and R. Mukherjee. Sparse uniformity testing. IEEE Transactions on Information Theory, 2024
-
[20]
P. Billingsley. Convergence of Probability Measures, 2nd edition. Wiley Series in Probability and Statistics, 1999
- [21]
-
[22]
T. Cai, J. Fan, and T. Jiang. Distributions of angles in random packing on spheres. J. Mach. Learn. Res., 14:1837–1864, 2013
- [23]
-
[24]
T. T. Cai and Z. Ma. Optimal hypothesis testing for high-dimensional covariance matrices. Bernoulli, 19(5B):2359–2388, 2013
-
[25]
Y.-B. Chan and P. Hall. Robust nearest-neighbor methods for classifying high-dimensional data. Ann. Statist., 37(6A):3186–3203, 2009
-
[26]
J. Cuesta-Albertos, A. Cuevas, and R. Fraiman. On projection-based tests for directional and compositional data. Statistics and Computing, 19:367–380, 2009
-
[27]
J. Cuesta-Albertos, R. Fraiman, and T. Ransford. A sharp form of the Cramér–Wold theorem. Journal of Theoretical Probability, 20(2):201–209, 2007
-
[28]
C. Cutting, D. Paindaveine, and T. Verdebout. Testing uniformity on high-dimensional spheres against contiguous rotationally symmetric alternatives. Ann. Stat., 45:1024–1058, 2017
-
[29]
C. Cutting, D. Paindaveine, and T. Verdebout. Testing uniformity on high-dimensional spheres: The non-null behaviour of the Bingham test. Annales de l’I.H.P. Probabilités et Statistiques, 58:567–602, 2022
-
[30]
H. Dehling, T. Mikosch, and M. Sorensen. Empirical Processes Techniques for Dependent Data. Birkhäuser, 2002
-
[31]
P. Diaconis and D. Freedman. A dozen de Finetti-style results in search of a theory. Annales de l’I.H.P. Probabilités et Statistiques, 23:397–423, 1987
-
[32]
I. L. Dryden. Statistical analysis on high-dimensional spheres and shape spaces. Ann. Stat., 33:1643–1665, 2005
-
[33]
J. Escanciano. A consistent diagnostic test for regression models using projections. Econometric Theory, 22(6):1030–1051, 2006
-
[34]
H. Federer. Geometric Measure Theory. Springer-Verlag, 1969
-
[35]
A. Fernández-de Marcos and E. García-Portugués. On new omnibus tests of uniformity on the hypersphere. Test, 32(4):1508–1529, 2023
-
[36]
R. Fraiman, L. Moreno, and T. Ransford. A Cramér–Wold theorem for elliptical distributions. Journal of Multivariate Analysis, 196:105176, 2023
-
[37]
R. Fraiman, L. Moreno, and T. Ransford. A quantitative Heppes theorem and multivariate Bernoulli distributions. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(2):293–314, 2023
-
[38]
R. Fraiman, L. Moreno, and T. Ransford. Application of the Cramér–Wold theorem to testing for invariance under group actions. TEST, 33(2):379–399, 2024
-
[39]
E. García-Portugués, P. Navarro-Esteban, and J. Cuesta-Albertos. A Cramér–von Mises test of uniformity on the hypersphere. In Statistical Learning and Modeling in Data Analysis: Methods and Applications 12, pages 107–116. Springer, 2021
-
[40]
E. García-Portugués, P. Navarro-Esteban, and J. Cuesta-Albertos. On a projection-based class of uniformity tests on the hypersphere. Bernoulli, 29(1):181–204, 2023
-
[41]
E. Giné. Invariant tests for uniformity on compact Riemannian manifolds based on Sobolev norms. Ann. Stat., 3(6):1243–1266, 1975
-
[42]
P. Hall and C. C. Heyde. Martingale Limit Theory and Its Application. Academic Press, New York, 1980
-
[43]
T. Jiang. The asymptotic distribution of the largest entry of sample correlation matrix. Annals of Applied Probability, 14(2):865–880, 2005
-
[44]
T. Jiang. A variance formula related to quantum conductance. Physics Letters A, 373:2117–2121, 2009
-
[45]
J. Jost, H. V. Lê, and T. D. Tran. Probabilistic morphisms and Bayesian nonparametrics. The European Physical Journal Plus, 136(4):1–29, 2021
-
[46]
J. Juan and F. J. Prieto. Using angles to identify concentrated multivariate outliers. Technometrics, 43:311–322, 2001
-
[47]
P. E. Jupp. Data-driven Sobolev tests of uniformity on compact Riemannian manifolds. Ann. Statist., 36(3):1246–1260, 2008
-
[48]
N. H. Kuiper. Tests concerning random points on a circle. Proc. K. Ned. Akad. Wet. A, 63:38–47, 1960
-
[49]
C. Ley, D. Paindaveine, and T. Verdebout. High-dimensional tests for spherical location and spiked covariance. Journal of Multivariate Analysis, 139:79–91, 2015
- [50]
-
[51]
R. Lin, W. Liu, Z. Liu, C. Feng, Z. Yu, Li Rehg, J., and L. Song. Regularizing neural networks via minimizing hyperspherical energy. In CVF Conference on Computer Vision and Pattern Recognition, CVPR, pages 13–19, 2020
-
[52]
W. Liu, R. Lin, Z. Liu, L. Liu, Z. Yu, B. Dai, and L. Song. Learning towards minimum hyperspherical energy. Advances in Neural Information Processing Systems, 31, 2018
-
[53]
M. Mahoney and C. Martin. Traditional and heavy tailed self regularization in neural network models. In International Conference on Machine Learning, pages 4284–4293. PMLR, 2019
-
[54]
K. V. Mardia and P. E. Jupp. Directional Statistics. John Wiley & Sons, Chichester, 2000
-
[55]
G. Marsaglia, W. W. Tsang, and J. Wang. Evaluating Kolmogorov’s distribution. Journal of Statistical Software, 8(18):1–4, 2003
-
[56]
E. Meckes. The Random Matrix Theory of the Classical Compact Groups. Cambridge University Press, 2019
-
[57]
A. Onatski, M. Moreira, and M. Hallin. Asymptotic power of sphericity tests for high-dimensional data. Ann. Stat., 41:1206–1231, 2013
-
[58]
D. Paindaveine and T. Verdebout. On high-dimensional sign tests. Bernoulli, 22:1745–1769, 2016
-
[59]
D. Paindaveine and T. Verdebout. Detecting the direction of a signal on high-dimensional spheres: non-null and Le Cam optimality results. Probability Theory and Related Fields, 176:1165–1216, 2020
-
[60]
A. Pewsey and E. García-Portugués. Recent advances in directional statistics. Test, 30(1):1–58, 2021
-
[61]
E. G. Portugués and T. Verdebout. An overview of uniformity tests on the hypersphere. arXiv:1804.00286, 2018
-
[62]
L. Wang, B. Peng, and R. Li. A high-dimensional nonparametric multivariate test for mean vector. Journal of the American Statistical Association, 110:1658–1669, 2015
-
[63]
G. S. Watson. Goodness-of-fit tests on a circle. Biometrika, 48(1 and 2):109–114, 1961
-
[64]
B. Xie, Y. Liang, and L. Song. Diverse neural network learns true target functions. In Artificial Intelligence and Statistics, pages 1216–1224. PMLR, 2017
-
[65]
C. Zou, L. Peng, L. Feng, and Z. Wang. Multivariate-sign-based high-dimensional tests for sphericity. Biometrika, 101:229–236, 2014.
Appendices
A Simulation results and background on spherical distributions
A.1 Common classes of high-dimensional alternatives
As noted in the survey paper [47], the uniformity testing problem (1) is of nonparametric nature...
-
[66]
Chi-squared distribution. Consider $\chi^2(1)$ and $\chi^2(2)$. We also normalize the distribution so that it is centered and has unit variance, i.e. for $\chi^2(k)$ we subtract $k$ from each generated observation and then divide by $\sqrt{2k}$.
-
[67]
Cauchy distribution. In this setting, we choose the standard Cauchy distribution. Note that this is a heavy-tailed setting, which is not covered in our theory. Interestingly, the proposed test works quite well
-
[68]
t-distribution. In this setting, we choose 1.5 degrees of freedom. With this particular choice, the distribution has mean 0 and infinite variance. The tail in this setting is lighter than that of the previous setting, but is still in the heavy-tailed regime. The results are presented in Table 4. Our proposed test $\phi_n$ outperforms a...
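For readers who want to experiment with the competing procedures, the Rayleigh, Bingham, and packing statistics quoted in excerpts [1] to [3] above can be computed directly from the pairwise inner products. The sketch below follows the formulas (2) to (4) as quoted; the function name and the list-of-lists data layout are assumptions made for illustration, and no critical values or null distributions are included.

```python
import math
from itertools import combinations


def classical_statistics(points):
    """Return (R_n, B_n, P_n) for unit vectors given as a list of lists.

    R_n = sqrt(2p)/n * sum_{i<j} X_i^T X_j
    B_n = p/n * sum_{i<j} [(X_i^T X_j)^2 - 1/p]
    P_n = p * max_{i<j} (X_i^T X_j)^2 - 4 log n + log log n
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    p = len(points[0])
    sum_linear = 0.0      # running sum for the Rayleigh kernel
    sum_quadratic = 0.0   # running sum for the Bingham kernel
    max_squared = 0.0     # largest squared inner product (packing test)
    for x, y in combinations(points, 2):
        t = sum(a * b for a, b in zip(x, y))
        sum_linear += t
        sum_quadratic += t * t - 1.0 / p
        max_squared = max(max_squared, t * t)
    rayleigh = math.sqrt(2 * p) / n * sum_linear
    bingham = p / n * sum_quadratic
    packing = p * max_squared - 4 * math.log(n) + math.log(math.log(n))
    return rayleigh, bingham, packing
```

All three statistics depend on the data only through the pairwise inner products, which is also the raw material of the paper's proposed distance.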