Maximum-of-Differences Test for Comparing Multivariate K-Sample Distributions
Pith reviewed 2026-05-10 16:54 UTC · model grok-4.3
The pith
A new maximum-of-differences test compares K multivariate samples by maximizing standardized gaps between within-sample and between-sample connection probabilities from pairwise distances.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors define the MOD statistic as the maximum, over all observations, of the standardized squared difference between the within-sample connection probability and the between-sample connection probability, where two observations are connected if their Euclidean distance is less than a pre-specified threshold. They introduce the covariance-adjusted version CA-MOD whose null limiting distribution is the Type I extreme value distribution under suitable regularity conditions, derive the asymptotic behavior of both statistics under the null and under fixed alternatives, and show that the tests remain applicable to multivariate linear models by replacing raw observations with residuals.
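The definition above can be written out as follows. The notation is assumed for illustration and is not taken from the paper: $X_1,\dots,X_N$ are the pooled observations, $g(i)$ is the sample label of $X_i$, $n_k$ is the size of sample $k$, $\delta$ is the pre-specified threshold, and $\hat\sigma_i^2$ is a standardizing variance whose exact form is not given in this summary. Excluding the observation itself from the within-sample count is also an assumption.

$$
A_{ij} = \mathbf{1}\{\lVert X_i - X_j\rVert < \delta\}, \qquad
\hat p_i^{W} = \frac{1}{n_{g(i)}-1}\sum_{j \ne i,\; g(j)=g(i)} A_{ij}, \qquad
\hat p_i^{B} = \frac{1}{N-n_{g(i)}}\sum_{j:\; g(j)\ne g(i)} A_{ij},
$$

$$
\mathrm{MOD} = \max_{1\le i\le N} \frac{\bigl(\hat p_i^{W}-\hat p_i^{B}\bigr)^2}{\hat\sigma_i^2}.
$$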
What carries the argument
The MOD statistic, the maximum over observations of the standardized squared difference between within-sample and between-sample connection probabilities induced by a distance-threshold graph on the pooled sample.
If this is right
- The test applies directly to any number K of multivariate samples without requiring equal sample sizes.
- The same construction yields a test for equality of conditional distributions in a multivariate regression model after replacing observations by residuals.
- Under the null, the CA-MOD statistic converges in distribution to the Type I extreme value law, permitting asymptotic p-value calculation.
- The test is consistent against alternatives in which at least one observation exhibits a local discrepancy in within-sample versus between-sample connection rates.
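As a rough illustration of the asymptotic p-value calculation mentioned in the list above, the sketch below evaluates the upper tail of the Type I extreme value (Gumbel) distribution, whose standard CDF is exp(-exp(-x)). The `loc` and `scale` arguments are hypothetical placeholders: the centering and scaling constants that apply to CA-MOD are not stated in this summary and would have to come from the paper.

```python
import math

def gumbel_upper_tail(x, loc=0.0, scale=1.0):
    """P(G > x) for a Gumbel variable G with the given location and scale.

    loc and scale are placeholders; the constants for CA-MOD are not given here.
    """
    z = (x - loc) / scale
    if z < -30.0:
        # exp(-z) is astronomically large, so the tail probability is 1 to machine precision
        return 1.0
    # 1 - exp(-exp(-z)), computed with expm1 for accuracy when the result is tiny
    return -math.expm1(-math.exp(-z))

def gumbel_critical_value(alpha, loc=0.0, scale=1.0):
    """Value c such that P(G > c) = alpha."""
    return loc - scale * math.log(-math.log(1.0 - alpha))

p_value = gumbel_upper_tail(3.5)        # p-value for a hypothetical normalized statistic of 3.5
cutoff = gumbel_critical_value(0.05)    # reject at the 5% level when the statistic exceeds this
```

Using `expm1` avoids the loss of precision that `1 - exp(...)` suffers for large statistics, which is where p-values matter most.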
Where Pith is reading between the lines
- Choosing the threshold adaptively from the data rather than fixing it in advance could reduce sensitivity to density variations across the support.
- The method supplies a graph-based alternative to integral probability metrics for multi-sample testing and may be combined with existing high-dimensional distance measures.
- Because the statistic isolates the single most discrepant observation, it can serve as a diagnostic pointer to which sample or region drives rejection.
Load-bearing premise
The procedure depends on a fixed pre-specified distance threshold for declaring connections together with regularity conditions that guarantee the covariance-adjusted statistic converges to the Type I extreme value distribution.
What would settle it
Repeated simulations drawn from identical multivariate distributions in which the empirical distribution of the CA-MOD statistic deviates systematically from the Type I extreme value limit as sample size grows would refute the claimed convergence.
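The shape of such a simulation check can be sketched on a stand-in case. The code below does not compute CA-MOD. It uses a textbook example with a known Gumbel limit, the maximum of n independent Exp(1) variables minus log n, and measures the Kolmogorov-Smirnov distance between the simulated maxima and the Gumbel CDF. The function names are illustrative; to test the paper's claim one would replace `simulate_normalized_maxima` with draws of the properly normalized CA-MOD statistic under the null.

```python
import numpy as np

def gumbel_cdf(x):
    """CDF of the standard Type I extreme value (Gumbel) distribution."""
    return np.exp(-np.exp(-x))

def ks_distance_to_gumbel(sample):
    """Kolmogorov-Smirnov distance between the sample's empirical CDF and the Gumbel CDF."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    F = gumbel_cdf(x)
    upper = np.arange(1, n + 1) / n - F   # empirical CDF just after each point, minus F
    lower = F - np.arange(0, n) / n       # F minus empirical CDF just before each point
    return float(max(upper.max(), lower.max()))

def simulate_normalized_maxima(n, reps, seed=0):
    """Stand-in statistic: max of n iid Exp(1) draws minus log(n), repeated reps times."""
    rng = np.random.default_rng(seed)
    draws = rng.exponential(size=(reps, n))
    return draws.max(axis=1) - np.log(n)

# Distances that shrink toward zero as n grows are consistent with the Gumbel limit;
# a distance that stays bounded away from zero would count against it.
distances = {n: ks_distance_to_gumbel(simulate_normalized_maxima(n, reps=2000)) for n in (10, 100, 500)}
```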
Original abstract
Comparing $K$-sample distributions is a fundamental problem in data science that arises in a wide variety of fields and applications. In this article, we introduce a maximum-of-differences approach to make such comparisons. Specifically, we first calculate the pairwise distances from the pooled observations of the $K$ samples. We then define the two observations as connected if their distance is less than a pre-specified threshold value. For each observation, we next calculate the "within" and the "between" probabilities associated with these two types of connections for the given observation, i.e., with other observations within the same sample and between the given observation and the observations in other samples. Subsequently, we propose a maximum-of-differences (MOD) test that finds the maximum value among the standardized squared differences between the "within" and the "between" probabilities of all observations. Accordingly, the proposed test is not only applicable to multivariate data with $K$ samples, but can also be extended to multivariate regression models. Furthermore, we obtain the covariance-adjusted (CA) version of the MOD (CA-MOD) test, which converges to the Type I extreme value distribution under some conditions. Moreover, we demonstrate the asymptotic properties of the two tests under both the null and alternative hypotheses. The performance and usefulness of the tests are illustrated via simulation studies and real examples.
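The steps described in the abstract can be sketched in a few lines of NumPy. This is a minimal sketch, not the authors' implementation: the abstract does not say how the squared differences are standardized, so the code assumes a pooled binomial-style variance, and the function name `mod_statistic` is illustrative. The covariance adjustment of CA-MOD is not attempted.

```python
import numpy as np

def mod_statistic(X, labels, threshold):
    """Sketch of a maximum-of-differences statistic.

    X: (N, d) pooled observations; labels: (N,) sample labels; threshold: connection radius.
    Returns (statistic, index of the observation attaining the maximum).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    N = X.shape[0]

    # Pairwise Euclidean distances on the pooled sample
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Two observations are connected when their distance is below the threshold
    A = D < threshold
    np.fill_diagonal(A, False)

    same = labels[:, None] == labels[None, :]
    np.fill_diagonal(same, False)
    other = labels[:, None] != labels[None, :]
    n_same, n_other = same.sum(axis=1), other.sum(axis=1)
    if (n_same == 0).any() or (n_other == 0).any():
        raise ValueError("need at least two samples with at least two observations each")

    within = (A & same).sum(axis=1) / n_same     # connection rate inside own sample
    between = (A & other).sum(axis=1) / n_other  # connection rate to the other samples

    # ASSUMPTION: standardize by a pooled binomial-style variance; the paper's form may differ
    pooled = A.sum(axis=1) / (N - 1)
    var = pooled * (1.0 - pooled) * (1.0 / n_same + 1.0 / n_other)

    sq = np.zeros(N)
    ok = var > 0  # var == 0 means pooled rate is 0 or 1, so within == between
    sq[ok] = (within[ok] - between[ok]) ** 2 / var[ok]
    i = int(np.argmax(sq))
    return float(sq[i]), i
```

The returned index identifies the observation with the largest standardized gap, which is what would let the statistic double as a pointer to the region driving a rejection.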
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a maximum-of-differences (MOD) test for comparing K-sample multivariate distributions. Observations are pooled, pairwise distances computed, and pairs declared connected if distance falls below a pre-specified threshold. Per-observation within-sample and between-sample connection probabilities are formed, their squared differences standardized, and the maximum taken as the test statistic. A covariance-adjusted variant (CA-MOD) is introduced that is claimed to converge to the Type I extreme-value distribution under unspecified conditions. Asymptotic properties are derived under both null and alternative hypotheses, the procedure is extended to multivariate regression, and performance is illustrated through simulations and real-data examples.
Significance. If the regularity conditions can be made explicit and a practical rule for threshold selection supplied, the MOD/CA-MOD framework would supply a new nonparametric, graph-based approach to multi-sample multivariate testing that avoids strong parametric assumptions. The claimed extreme-value limit and the extension to regression settings would constitute a genuine methodological contribution provided the supporting derivations are complete and the finite-sample behavior is convincingly documented.
major comments (3)
- [Method description / abstract] The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.
- [CA-MOD construction] For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).
- [Asymptotic results] The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.
minor comments (2)
- [Abstract] The abstract states that the method “can also be extended to multivariate regression models” but supplies no concrete description of the extension; the main text should contain at least a brief outline of the necessary modifications.
- [Simulation section] Simulation results are mentioned but no tables or figures reporting empirical Type I error rates or power under varying dimensions and sample sizes are referenced in the abstract; these should be added with explicit numerical summaries.
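On the regression extension raised in the minor comments: the summary says only that raw observations are replaced by residuals. One plausible reading, assumed here and not confirmed by the paper, is an ordinary least squares fit of the multivariate response on the covariates using the pooled data, with the resulting residual vectors fed to the test in place of the observations. The helper name `ols_residuals` is illustrative.

```python
import numpy as np

def ols_residuals(Z, Y):
    """Residuals from a multivariate linear model Y = b0 + Z B + E fitted by least squares.

    Z: (N, p) covariates; Y: (N, q) responses. Returns an (N, q) residual matrix.
    ASSUMPTION: a single pooled OLS fit with intercept; the paper may fit differently.
    """
    Z = np.asarray(Z, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Z1 = np.column_stack([np.ones(Z.shape[0]), Z])   # add an intercept column
    coef, *_ = np.linalg.lstsq(Z1, Y, rcond=None)    # least-squares coefficients, shape (p + 1, q)
    return Y - Z1 @ coef
```

The residual rows keep the sample labels of the original observations, so they can be passed to any K-sample procedure that expects an observation matrix plus labels.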
Simulated Author's Rebuttal
We are grateful to the referee for the constructive feedback on our manuscript. We have carefully considered each of the major comments and provide point-by-point responses below. We plan to incorporate several revisions to address the concerns raised.
Point-by-point responses
- Referee: The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.
  Authors: We agree that providing guidance on threshold selection is important for practical use of the MOD test. Although the current manuscript treats the threshold as a pre-specified parameter (similar to bandwidth in kernel methods), we will revise the manuscript to include a practical data-driven rule, such as selecting the threshold as the median of pairwise distances within a subsample or using a cross-validation approach to optimize the test power. We will also add a discussion on how the choice affects the graph sparsity and include sensitivity analyses in the simulations.
  Revision: yes
- Referee: For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).
  Authors: The covariance matrix estimate in CA-MOD is computed from the pooled sample under the null hypothesis, and our derivation shows that the estimation error is negligible in the limit, preserving the extreme-value convergence. However, we acknowledge that this was not explicitly stated. In the revision, we will add a detailed explanation of how the plug-in estimation is accounted for in the asymptotic analysis, including why it does not introduce invalidating dependence.
  Revision: yes
- Referee: The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.
  Authors: We appreciate this observation. The phrase “under some conditions” was intended to refer to standard assumptions such as finite moments of the distance indicators, weak dependence between observations, and appropriate scaling of the threshold with sample size. In the revised manuscript, we will explicitly list these regularity conditions in a dedicated subsection and provide verification for the multivariate case under the null hypothesis.
  Revision: yes
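The first response above floats a median-of-pairwise-distances rule for the threshold. A minimal sketch of that heuristic follows. The subsample size, the seed handling, and the function name `median_distance_threshold` are assumptions made for illustration, and nothing here says the rule preserves the claimed null limit.

```python
import numpy as np

def median_distance_threshold(X, subsample=200, seed=0):
    """Median of pairwise Euclidean distances within a random subsample of the pooled data."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    m = min(subsample, X.shape[0])
    idx = rng.choice(X.shape[0], size=m, replace=False)  # subsample without replacement
    S = X[idx]
    diff = S[:, None, :] - S[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))
    iu = np.triu_indices(m, k=1)                         # each unordered pair once, no diagonal
    return float(np.median(D[iu]))

# Three collinear points with pairwise distances 5, 5 and 10
threshold = median_distance_threshold(np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]))
```

Because the threshold is then a function of the data, any use of it would need the sensitivity analysis the authors promise, since the asymptotic results summarized here are stated for a pre-specified value.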
Circularity Check
No significant circularity detected in the derivation chain.
full rationale
The MOD statistic is constructed directly by pooling observations, computing pairwise distances against a fixed pre-specified threshold, forming per-observation within- and between-connection probabilities, standardizing their squared differences, and taking the maximum. The CA-MOD variant applies a covariance adjustment whose asymptotic convergence to the Type I extreme-value limit is asserted under regularity conditions on the threshold and dependence structure. None of these steps reduces the claimed limit or test by construction to a fitted parameter renamed as a prediction, a self-citation chain, or an ansatz smuggled from prior work; the threshold and covariance estimator are treated as inputs whose consistency is assumed for the mathematical derivation to hold. The procedure therefore remains self-contained as a definition plus conditional asymptotics rather than a tautological loop.
Axiom & Free-Parameter Ledger
free parameters (1)
- threshold value