Logarithmic energy distances and Gini covariance for Hilbert-valued random elements
Pith reviewed 2026-06-26 23:13 UTC · model grok-4.3
The pith
As alpha approaches zero, suitably normalized energy distances in Hilbert spaces converge to a logarithmic energy distance, built on the log of the norm, that still characterizes equality of distributions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
After suitable normalization, the energy distance with kernel ||x-y||^alpha converges to the logarithmic energy distance with kernel log||x-y|| as alpha ↓ 0. This logarithmic version retains the characterization that the distance is zero if and only if the two random elements have the same distribution in a real separable Hilbert space. It admits a representation in terms of Gaussian-kernel maximum mean discrepancies. Motivated by this, a logarithmic Gini covariance is defined for the k-sample problem, with representations in terms of pairwise distances, a characterization theorem, and asymptotic theory for the empirical version.
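A short calculation shows why a logarithm appears in the limit. The sketch below assumes that the normalization is division by α, and writes X' and Y' for independent copies of X and Y. The paper's exact normalizing constant is not stated here, so the factor 1/α is an assumption.

```latex
\begin{align*}
D_\alpha(X,Y) &= 2\,\mathbb{E}\|X-Y\|^\alpha - \mathbb{E}\|X-X'\|^\alpha - \mathbb{E}\|Y-Y'\|^\alpha,\\
\frac{D_\alpha(X,Y)}{\alpha} &= 2\,\mathbb{E}\,\frac{\|X-Y\|^\alpha-1}{\alpha}
  - \mathbb{E}\,\frac{\|X-X'\|^\alpha-1}{\alpha}
  - \mathbb{E}\,\frac{\|Y-Y'\|^\alpha-1}{\alpha},\\
\lim_{\alpha\downarrow 0}\frac{t^\alpha-1}{\alpha} &= \log t \qquad (t>0),\\
\lim_{\alpha\downarrow 0}\frac{D_\alpha(X,Y)}{\alpha} &= 2\,\mathbb{E}\log\|X-Y\| - \mathbb{E}\log\|X-X'\| - \mathbb{E}\log\|Y-Y'\|.
\end{align*}
```

Subtracting 1 inside each expectation in the second line changes nothing, because the coefficients satisfy 2 − 1 − 1 = 0. The last line holds only if the limit can be taken inside the expectations and the logarithmic expectations are finite.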
What carries the argument
The logarithmic energy distance defined via the kernel (x,y) ↦ log||x-y||, which arises as the normalized limit of generalized energy distances and supports the representation via Gaussian-kernel MMDs.
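One way a Gaussian-kernel MMD representation can arise is through Frullani's integral for the logarithm. The derivation below is a sketch under assumptions, not necessarily the paper's route: X' and Y' are independent copies of X and Y, the scale parameter s (which corresponds to a bandwidth through s = 1/(2σ²)) is notation introduced here, and Fubini's theorem is assumed to apply.

```latex
\begin{align*}
\log t &= \frac12\int_0^\infty \frac{e^{-s}-e^{-s t^2}}{s}\,\mathrm{d}s \qquad (t>0),\\
\mathrm{MMD}_s^2(X,Y) &= \mathbb{E}\,e^{-s\|X-X'\|^2} + \mathbb{E}\,e^{-s\|Y-Y'\|^2} - 2\,\mathbb{E}\,e^{-s\|X-Y\|^2},\\
2\,\mathbb{E}\log\|X-Y\| - \mathbb{E}\log\|X-X'\| - \mathbb{E}\log\|Y-Y'\|
  &= \frac12\int_0^\infty \mathrm{MMD}_s^2(X,Y)\,\frac{\mathrm{d}s}{s}.
\end{align*}
```

The e^{-s} terms cancel in the last line because the coefficients 2, −1 and −1 sum to zero. In this form the logarithmic energy distance is a mixture of squared Gaussian-kernel MMDs over all scales, weighted by ds/s.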
If this is right
- The logarithmic energy distance characterizes equality of distributions for Hilbert-valued random elements.
- A logarithmic Gini covariance statistic applies to testing equality of distributions in the k-sample problem.
- Asymptotic distributions under the null and alternatives are available for the empirical logarithmic Gini covariance.
- Permutation-based procedures implement the test based on the logarithmic Gini covariance.
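The last two points can be illustrated with a minimal permutation test. The sketch makes several assumptions: finite-dimensional Euclidean data stand in for Hilbert-valued elements, no two observations coincide (otherwise log 0 appears), and the k-sample statistic is a hypothetical size-weighted sum of pairwise empirical logarithmic energy distances. The paper's actual definition of the logarithmic Gini covariance is not reproduced here, and all function names are illustrative.

```python
import numpy as np

def log_distance_matrix(z):
    """Log Euclidean distances between all rows of z (diagonal set to 0)."""
    diff = z[:, None, :] - z[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, 1.0)  # log(1) = 0, so the diagonal contributes nothing
    return np.log(dist)

def within_mean(logd, idx):
    """Mean log distance over distinct pairs inside one group."""
    m = len(idx)
    return logd[np.ix_(idx, idx)].sum() / (m * (m - 1))

def k_sample_stat(logd, labels):
    """Hypothetical aggregate: weighted sum of pairwise log energy distances."""
    groups = [np.flatnonzero(labels == g) for g in np.unique(labels)]
    n = len(labels)
    total = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            a, b = groups[i], groups[j]
            cross = logd[np.ix_(a, b)].mean()
            pair = 2.0 * cross - within_mean(logd, a) - within_mean(logd, b)
            total += (len(a) * len(b) / n ** 2) * pair
    return total

def permutation_pvalue(samples, n_perm=499, seed=0):
    """Permute group labels over the pooled sample; reject for large values."""
    z = np.vstack(samples)
    labels = np.repeat(np.arange(len(samples)), [len(s) for s in samples])
    logd = log_distance_matrix(z)  # computed once, reused for every permutation
    rng = np.random.default_rng(seed)
    observed = k_sample_stat(logd, labels)
    perm = np.array([k_sample_stat(logd, rng.permutation(labels))
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, p_value
```

Because the log-distance matrix of the pooled sample does not change under relabeling, each permutation only re-indexes it, which keeps the procedure cheap for moderate sample sizes.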
Where Pith is reading between the lines
- The boundary case may strengthen links between energy statistics and kernel methods for high-dimensional or functional data.
- The logarithmic form could inspire similar limit investigations for other powered kernels or in non-Hilbert spaces.
- Applications in high-dimensional inference may benefit when standard distances suffer from concentration effects.
Load-bearing premise
The limit of the suitably normalized energy distance exists as alpha approaches zero from above in a real separable Hilbert space.
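For a finite sample the premise can be checked numerically. The sketch below assumes division by α as the normalization, uses Euclidean data in place of Hilbert-valued elements, and excludes identical pairs from the within-sample averages. The function names and the simulated data are illustrative only.

```python
import numpy as np

def mean_kernel(a, b, kernel, same=False):
    """Average of kernel(distance) over pairs of rows of a and b."""
    diff = a[:, None, :] - b[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    if same:
        # Drop the zero self-distances on the diagonal.
        dist = dist[~np.eye(len(a), dtype=bool)]
    return kernel(dist).mean()

def energy_type_distance(x, y, kernel):
    """2 E k(X,Y) - E k(X,X') - E k(Y,Y'), estimated from two samples."""
    return (2.0 * mean_kernel(x, y, kernel)
            - mean_kernel(x, x, kernel, same=True)
            - mean_kernel(y, y, kernel, same=True))

def normalized_alpha_distance(x, y, alpha):
    """Empirical alpha-energy distance divided by alpha (assumed normalization)."""
    return energy_type_distance(x, y, lambda d: d ** alpha) / alpha

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 3))
y = rng.normal(loc=0.5, size=(200, 3))

log_version = energy_type_distance(x, y, np.log)
for alpha in (1.0, 0.1, 0.01, 0.001):
    print(alpha, normalized_alpha_distance(x, y, alpha), log_version)
```

On a fixed sample the normalized values approach the logarithmic version as α shrinks, since (d^α − 1)/α − log d is of order α for each positive distance d. This says nothing about the population-level interchange of limit and expectation, which is the actual premise.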
What would settle it
Two distinct distributions on a separable Hilbert space whose logarithmic energy distance equals zero would disprove the retained characterization property.
Original abstract
For $\alpha\in(0,2)$, the generalized energy distance and the Gini covariance statistic are based on kernels of the form $(x,y)\mapsto \|x-y\|^\alpha$, where $\|\cdot\|$ denotes the norm in a real separable Hilbert space. This paper investigates the boundary regime $\alpha\downarrow 0$. After suitable normalization, the corresponding energy distance converges to a logarithmic energy distance involving the kernel $(x,y)\mapsto\log\|x-y\|$. We establish that the resulting logarithmic energy distance retains the fundamental characterization property of ordinary energy distances in separable Hilbert spaces and derive a representation in terms of Gaussian-kernel maximum mean discrepancies. Motivated by this representation, we introduce a logarithmic Gini covariance for the $k$-sample problem and investigate its structural and asymptotic properties. In particular, we derive a representation in terms of pairwise logarithmic energy distances, establish a characterization theorem for equality of distributions, develop asymptotic null and alternative theory for the corresponding empirical statistic, and discuss permutation-based implementation. The logarithmic framework reveals a new boundary phenomenon within the family of energy-type statistics and provides connections with kernel methods, functional data analysis, and high-dimensional inference.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies the boundary regime α↓0 of the generalized energy distance and Gini covariance based on kernels ||x−y||^α (α∈(0,2)) for random elements in separable Hilbert spaces. After suitable α-dependent normalization it claims convergence to a logarithmic energy distance with kernel log||x−y||, proves that this limit retains the characterization property of ordinary energy distances, derives an MMD representation with Gaussian kernels, and introduces a logarithmic Gini covariance for the k-sample problem together with its structural properties, asymptotic theory, and permutation implementation.
Significance. If the normalized limit and characterization theorem are rigorously established, the work supplies a new boundary case linking energy distances to kernel methods and MMD, with potential utility in functional data analysis and high-dimensional inference. The explicit MMD representation and the permutation-based implementation are concrete strengths that would make the logarithmic Gini statistic immediately usable.
major comments (2)
- [normalization step / characterization theorem] The central claim that the normalized α-energy distance converges to the logarithmic version (abstract and the derivation leading to Eq. (log-energy)) requires a justification that the limit and expectation may be interchanged for arbitrary distributions on the Hilbert space. In infinite dimensions the family {||X−Y||^α : α ∈ (0,ε)} need not admit a uniform integrable dominant, so the paper must supply either an explicit dominating function, a monotone-convergence argument, or a truncation-plus-remainder estimate that works uniformly over the class of distributions for which the α-energy distance is defined.
- [MMD representation] The representation of the logarithmic energy distance in terms of Gaussian-kernel MMD (the claim following the limit) is load-bearing for the subsequent Gini-covariance construction; the manuscript should state the precise conditions on the Gaussian bandwidth under which the representation holds and verify that these conditions are compatible with the Hilbert-space setting used for the characterization theorem.
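The monotone-convergence route named in the first major comment rests on an elementary fact about the normalized power kernel. The lines below are a sketch of that option only, not the paper's argument. Here α_0 is a hypothetical fixed exponent in (0,2) for which the α_0-th moment of the distance is finite.

```latex
\begin{align*}
\frac{t^\alpha-1}{\alpha} &= \int_0^1 t^{\alpha u}\log t\,\mathrm{d}u \qquad (t>0,\ \alpha>0),\\
\frac{\partial}{\partial\alpha}\,\frac{t^\alpha-1}{\alpha} &= \int_0^1 u\,t^{\alpha u}(\log t)^2\,\mathrm{d}u \;\ge\; 0,\\
\log t \;\le\; \frac{t^\alpha-1}{\alpha} &\le \frac{t^{\alpha_0}-1}{\alpha_0} \qquad (0<\alpha\le\alpha_0).
\end{align*}
```

So (t^α − 1)/α decreases to log t as α ↓ 0 and is bounded above by an integrable function whenever E||X−Y||^{α_0} is finite. Monotone convergence then gives convergence of each expectation, possibly to −∞. The normalized distance itself converges once the within-sample logarithmic expectations are finite, which rules out an ∞ − ∞ situation.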
minor comments (2)
- Notation for the normalized logarithmic distance should be introduced once and used consistently; currently the abstract and the later Gini section appear to employ slightly different scaling constants.
- The asymptotic null and alternative theory for the empirical logarithmic Gini statistic would benefit from an explicit statement of the moment conditions required on the underlying random elements.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. The two major comments identify technical points that require clarification or additional justification. We respond to each below and will revise the manuscript accordingly.
Point-by-point responses
- Referee: [normalization step / characterization theorem] The central claim that the normalized α-energy distance converges to the logarithmic version (abstract and the derivation leading to Eq. (log-energy)) requires a justification that the limit and expectation may be interchanged for arbitrary distributions on the Hilbert space. In infinite dimensions the family {||X−Y||^α : α ∈ (0,ε)} need not admit a uniform integrable dominant, so the paper must supply either an explicit dominating function, a monotone-convergence argument, or a truncation-plus-remainder estimate that works uniformly over the class of distributions for which the α-energy distance is defined.
Authors: We agree that an explicit justification for interchanging the limit and the expectation is required in the infinite-dimensional setting. In the revised version we will insert a dedicated lemma that supplies a truncation-plus-remainder argument. The argument proceeds by truncating the norm at a large but finite level M, applying the dominated-convergence theorem on the truncated part (where the integrand is bounded), and controlling the remainder uniformly over all distributions that possess finite α-energy distance by using the monotonicity of t ↦ t^α for α ∈ (0,2) together with the triangle inequality in the Hilbert norm. This establishes the desired interchange without requiring a single integrable dominant for the whole family. revision: yes
- Referee: [MMD representation] The representation of the logarithmic energy distance in terms of Gaussian-kernel MMD (the claim following the limit) is load-bearing for the subsequent Gini-covariance construction; the manuscript should state the precise conditions on the Gaussian bandwidth under which the representation holds and verify that these conditions are compatible with the Hilbert-space setting used for the characterization theorem.
Authors: We will add an explicit statement that the MMD representation holds for every bandwidth σ > 0. Because the underlying space is a separable Hilbert space, the Gaussian kernel exp(−‖x−y‖²/(2σ²)) is positive definite and the associated RKHS is well-defined. The proof of the representation relies only on the Fourier transform of the Gaussian and on the fact that the logarithmic kernel arises as the α → 0 limit; these steps remain valid for any σ > 0 and impose no further restrictions beyond those already used for the characterization theorem. A short remark will be inserted to confirm this compatibility. revision: yes
Circularity Check
No significant circularity; derivations are self-contained
full rationale
The paper defines the logarithmic energy distance explicitly as the normalized limit of the α-energy distance (α↓0) with kernel log‖x−y‖, then separately proves that this object retains the characterization property for equality of distributions and admits an MMD representation with Gaussian kernels. These steps are carried out via direct limiting arguments and kernel identities in separable Hilbert spaces; no load-bearing step reduces by construction to a fitted parameter, a self-referential definition, or a self-citation chain whose validity depends on the present work. The derivation chain is therefore independent of its target conclusions.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: the underlying space is a real separable Hilbert space.