pith. machine review for the scientific record.

arxiv: 2604.09376 · v1 · submitted 2026-04-10 · 📊 stat.ME

Maximum-of-Differences Test for Comparing Multivariate K-Sample Distributions

Pith reviewed 2026-05-10 16:54 UTC · model grok-4.3

classification 📊 stat.ME
keywords maximum-of-differences test · K-sample comparison · multivariate distributions · nonparametric test · extreme value distribution · pairwise distances · connection probabilities · covariance adjustment

The pith

A new maximum-of-differences test compares K multivariate samples by maximizing standardized gaps between within-sample and between-sample connection probabilities from pairwise distances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a test for whether K multivariate distributions are identical. It pools all observations, builds connections whenever pairwise distances fall below a fixed threshold, and for each point computes the probability of connections staying inside its own sample versus crossing to other samples. The test statistic is the largest standardized squared difference across all points, with a covariance-adjusted variant that has a known limiting distribution. This construction works for any dimension and K, requires no parametric assumptions on the distributions, and extends to regression settings where the goal is to compare conditional distributions.
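The pipeline described above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function name `mod_sketch`, the threshold argument `tau`, and in particular the standardization (a pooled-binomial variance for the difference of two proportions) are assumptions made here, because the summary does not state the paper's exact standardizing formula.

```python
import math

def mod_sketch(samples, tau):
    """Return (max_score, index, scores) for a MOD-style statistic.

    samples: list of K >= 2 samples, each a list of at least two
             coordinate tuples of equal dimension.
    tau:     fixed distance threshold that defines a connection.
    The standardization used below is an illustrative assumption.
    """
    # Pool the observations and remember which sample each came from.
    points = [p for s in samples for p in s]
    labels = [k for k, s in enumerate(samples) for _ in s]
    sizes = [len(s) for s in samples]
    n = len(points)

    scores = []
    for i in range(n):
        n_within = sizes[labels[i]] - 1   # other points in the same sample
        n_between = n - sizes[labels[i]]  # points in all other samples
        within = between = 0
        for j in range(n):
            if j != i and math.dist(points[i], points[j]) < tau:
                if labels[j] == labels[i]:
                    within += 1
                else:
                    between += 1
        p_w = within / n_within
        p_b = between / n_between
        # Assumed standardization: the pooled connection rate yields a
        # binomial variance for the difference of the two proportions.
        p = (within + between) / (n - 1)
        var = p * (1 - p) * (1 / n_within + 1 / n_between)
        scores.append((p_w - p_b) ** 2 / var if var > 0 else 0.0)

    best = max(range(n), key=scores.__getitem__)
    return scores[best], best, scores
```

The returned index identifies the observation at which the maximum is attained. The double loop is quadratic in the pooled sample size, which is acceptable for a sketch but not tuned for large data.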

Core claim

The authors define the MOD statistic as the maximum, over all observations, of the standardized squared difference between the within-sample connection probability and the between-sample connection probability, where two observations are connected if their Euclidean distance is less than a pre-specified threshold. They introduce the covariance-adjusted version CA-MOD whose null limiting distribution is the Type I extreme value distribution under suitable regularity conditions, derive the asymptotic behavior of both statistics under the null and under fixed alternatives, and show that the tests remain applicable to multivariate linear models by replacing raw observations with residuals.
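The residual substitution mentioned at the end of the claim above can be illustrated with a deliberately simple case: one scalar covariate plus an intercept, fitted to each response coordinate by ordinary least squares. The helper name `ols_residuals` and the single-covariate restriction are assumptions of this sketch; the paper's multivariate linear model is more general, and the summary does not say whether the fit is pooled or per sample.

```python
def ols_residuals(x, Y):
    """Residuals from regressing each column of Y on an intercept and x.

    x: list of n scalar covariate values (not all equal).
    Y: list of n response tuples, each of dimension d.
    Returns a list of n residual tuples of dimension d.
    """
    n = len(x)
    d = len(Y[0])
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)

    residuals = [[0.0] * d for _ in range(n)]
    for c in range(d):
        y = [row[c] for row in Y]
        y_bar = sum(y) / n
        # Closed-form simple-regression slope and intercept.
        slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
        intercept = y_bar - slope * x_bar
        for i in range(n):
            residuals[i][c] = y[i] - (intercept + slope * x[i])
    return [tuple(r) for r in residuals]
```

The residual tuples would then take the place of the raw observations when pairwise distances are computed.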

What carries the argument

The MOD statistic, the maximum over observations of the standardized squared difference between within-sample and between-sample connection probabilities induced by a distance-threshold graph on the pooled sample.

If this is right

  • The test applies directly to any number K of multivariate samples without requiring equal sample sizes.
  • The same construction yields a test for equality of conditional distributions in a multivariate regression model after replacing observations by residuals.
  • Under the null, the CA-MOD statistic converges in distribution to the Type I extreme value law, permitting asymptotic p-value calculation.
  • The test is consistent against alternatives in which at least one observation exhibits a local discrepancy in within-sample versus between-sample connection rates.
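For the asymptotic p-value mentioned in the list above, the upper tail of a Type I extreme value (Gumbel) law has a closed form. The sketch below assumes the statistic has already been centered and scaled; the arguments `loc` and `scale` are placeholders, since the summary does not give the paper's normalizing constants.

```python
import math

def gumbel_pvalue(stat, loc=0.0, scale=1.0):
    """Upper-tail probability of a Gumbel(loc, scale) law at `stat`."""
    z = (stat - loc) / scale
    if z < -700.0:
        # exp(-z) would overflow; the tail probability is 1 to machine precision.
        return 1.0
    # 1 - exp(-exp(-z)), written with expm1 for accuracy in the far tail.
    return -math.expm1(-math.exp(-z))
```

Large values of the statistic give small p-values, so the test rejects in the upper tail.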

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Choosing the threshold adaptively from the data rather than fixing it in advance could reduce sensitivity to density variations across the support.
  • The method supplies a graph-based alternative to integral probability metrics for multi-sample testing and may be combined with existing high-dimensional distance measures.
  • Because the statistic isolates the single most discrepant observation, it can serve as a diagnostic pointer to which sample or region drives rejection.
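The diagnostic use suggested in the last point can be sketched as a ranking of per-observation scores. The function name `top_discrepant` and the choice to tally the top `k` observations by sample are assumptions of this illustration, not something the paper proposes.

```python
from collections import Counter

def top_discrepant(scores, labels, k=5):
    """Indices of the k largest scores and a tally of their sample labels."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top = order[:k]
    return top, Counter(labels[i] for i in top)
```

A tally concentrated in one sample would hint that this sample drives the rejection, although it is not a formal post hoc test.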

Load-bearing premise

The procedure depends on a fixed pre-specified distance threshold for declaring connections together with regularity conditions that guarantee the covariance-adjusted statistic converges to the Type I extreme value distribution.

What would settle it

Repeated simulations drawn from identical multivariate distributions in which the empirical distribution of the CA-MOD statistic deviates systematically from the Type I extreme value limit as sample size grows would refute the claimed convergence.
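A check of this kind can be sketched as a Monte Carlo harness that measures the Kolmogorov-Smirnov distance between simulated null statistics and the standard Gumbel distribution function. Because the summary does not define CA-MOD precisely, the sketch uses a stand-in statistic with a known Gumbel limit: the maximum of n independent Exp(1) draws minus log n. All function names are hypothetical.

```python
import math
import random

def gumbel_cdf(x):
    """Standard Type I extreme value distribution function."""
    return math.exp(-math.exp(-x))

def ks_distance_to_gumbel(draws):
    """Kolmogorov-Smirnov distance between draws and the standard Gumbel CDF."""
    xs = sorted(draws)
    m = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = gumbel_cdf(x)
        # Compare F with the empirical CDF just after and just before x.
        d = max(d, (i + 1) / m - f, f - i / m)
    return d

def simulate_null(statistic, n, reps, rng):
    """Draw `reps` independent copies of statistic(n, rng)."""
    return [statistic(n, rng) for _ in range(reps)]

def exp_max_statistic(n, rng):
    # Stand-in for CA-MOD: max of n iid Exp(1) minus log n, Gumbel in the limit.
    return max(rng.expovariate(1.0) for _ in range(n)) - math.log(n)

rng = random.Random(0)
draws = simulate_null(exp_max_statistic, n=200, reps=2000, rng=rng)
print(round(ks_distance_to_gumbel(draws), 3))
```

Replacing `exp_max_statistic` with a properly normalized CA-MOD implementation and repeating the run for growing n would show whether the distance shrinks, as the claimed convergence requires.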

Original abstract

Comparing $K$-sample distributions is a fundamental problem in data science that arises in a wide variety of fields and applications. In this article, we introduce a maximum-of-differences approach to make such comparisons. Specifically, we first calculate the pairwise distances from the pooled observations of the $K$ samples. We then define the two observations as connected if their distance is less than a pre-specified threshold value. For each observation, we next calculate the "within" and the "between" probabilities associated with these two types of connections for the given observation, i.e., with other observations within the same sample and between the given observation and the observations in other samples. Subsequently, we propose a maximum-of-differences (MOD) test that finds the maximum value among the standardized squared differences between the "within" and the "between" probabilities of all observations. Accordingly, the proposed test is not only applicable to multivariate data with $K$ samples, but can also be extended to multivariate regression models. Furthermore, we obtain the covariance-adjusted (CA) version of the MOD (CA-MOD) test, which converges to the Type I extreme value distribution under some conditions. Moreover, we demonstrate the asymptotic properties of the two tests under both the null and alternative hypotheses. The performance and usefulness of the tests are illustrated via simulation studies and real examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a maximum-of-differences (MOD) test for comparing K-sample multivariate distributions. Observations are pooled, pairwise distances computed, and pairs declared connected if distance falls below a pre-specified threshold. Per-observation within-sample and between-sample connection probabilities are formed, their squared differences standardized, and the maximum taken as the test statistic. A covariance-adjusted variant (CA-MOD) is introduced that is claimed to converge to the Type I extreme-value distribution under unspecified conditions. Asymptotic properties are derived under both null and alternative hypotheses, the procedure is extended to multivariate regression, and performance is illustrated through simulations and real-data examples.

Significance. If the regularity conditions can be made explicit and a practical rule for threshold selection supplied, the MOD/CA-MOD framework would supply a new nonparametric, graph-based approach to multi-sample multivariate testing that avoids strong parametric assumptions. The claimed extreme-value limit and the extension to regression settings would constitute a genuine methodological contribution provided the supporting derivations are complete and the finite-sample behavior is convincingly documented.

major comments (3)
  1. [Method description / abstract] The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.
  2. [CA-MOD construction] For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).
  3. [Asymptotic results] The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.
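The first major comment's point that the threshold governs graph sparsity can be made concrete with a small helper that reports the fraction of connected pairs at a given threshold. The name `edge_density` is hypothetical, and the sketch assumes Euclidean distance as in the paper's description.

```python
import math
from itertools import combinations

def edge_density(points, tau):
    """Fraction of pairs whose Euclidean distance is below tau."""
    pairs = list(combinations(points, 2))
    connected = sum(1 for p, q in pairs if math.dist(p, q) < tau)
    return connected / len(pairs)
```

Evaluating this over a grid of thresholds shows where the graph is nearly empty or nearly complete. In both extremes the per-observation connection proportions are close to degenerate, which is the regime the referee worries about.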
minor comments (2)
  1. [Abstract] The abstract states that the method “can also be extended to multivariate regression models” but supplies no concrete description of the extension; the main text should contain at least a brief outline of the necessary modifications.
  2. [Simulation section] Simulation results are mentioned but no tables or figures reporting empirical Type I error rates or power under varying dimensions and sample sizes are referenced in the abstract; these should be added with explicit numerical summaries.

Simulated Authors' Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for the constructive feedback on our manuscript. We have carefully considered each of the major comments and provide point-by-point responses below. We plan to incorporate several revisions to address the concerns raised.

  1. Referee: The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.

    Authors: We agree that providing guidance on threshold selection is important for practical use of the MOD test. Although the current manuscript treats the threshold as a pre-specified parameter (similar to bandwidth in kernel methods), we will revise the manuscript to include a practical data-driven rule, such as selecting the threshold as the median of pairwise distances within a subsample or using a cross-validation approach to optimize the test power. We will also add a discussion on how the choice affects the graph sparsity and include sensitivity analyses in the simulations. revision: yes

  2. Referee: For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).

    Authors: The covariance matrix estimate in CA-MOD is computed from the pooled sample under the null hypothesis, and our derivation shows that the estimation error is negligible in the limit, preserving the extreme-value convergence. However, we acknowledge that this was not explicitly stated. In the revision, we will add a detailed explanation of how the plug-in estimation is accounted for in the asymptotic analysis, including why it does not introduce invalidating dependence. revision: yes

  3. Referee: The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.

    Authors: We appreciate this observation. The phrase 'under some conditions' was intended to refer to standard assumptions such as finite moments of the distance indicators, weak dependence between observations, and appropriate scaling of the threshold with sample size. In the revised manuscript, we will explicitly list these regularity conditions in a dedicated subsection and provide verification for the multivariate case under the null hypothesis. revision: yes
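The median-of-pairwise-distances rule floated in the first response above can be sketched as follows. The function name, the optional subsampling argument, and the fixed seed are assumptions of this illustration; the rebuttal names the idea but gives no algorithm.

```python
import math
import random
import statistics
from itertools import combinations

def median_distance_threshold(points, subsample_size=None, seed=0):
    """Median Euclidean distance over all pairs of (optionally subsampled) points."""
    pts = list(points)
    if subsample_size is not None and subsample_size < len(pts):
        # Draw the subsample without replacement, reproducibly.
        pts = random.Random(seed).sample(pts, subsample_size)
    return statistics.median(math.dist(p, q) for p, q in combinations(pts, 2))
```

A threshold chosen this way depends on the data, so whether the stated asymptotics still apply is a separate question that the summary leaves open.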

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain.


The MOD statistic is constructed directly by pooling observations, computing pairwise distances against a fixed pre-specified threshold, forming per-observation within- and between-connection probabilities, standardizing their squared differences, and taking the maximum. The CA-MOD variant applies a covariance adjustment whose asymptotic convergence to the Type I extreme-value limit is asserted under regularity conditions on the threshold and dependence structure. None of these steps reduces the claimed limit or test by construction to a fitted parameter renamed as a prediction, a self-citation chain, or an ansatz smuggled from prior work; the threshold and covariance estimator are treated as inputs whose consistency is assumed for the mathematical derivation to hold. The procedure therefore remains self-contained as a definition plus conditional asymptotics rather than a tautological loop.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

The procedure depends on a user-chosen distance threshold and on covariance estimation for the adjusted version; no other free parameters or invented entities are mentioned.

free parameters (1)
  • threshold value
    Pre-specified cutoff on pairwise distances that defines whether two observations are connected.

pith-pipeline@v0.9.0 · 5542 in / 1255 out tokens · 51113 ms · 2026-05-10T16:54:10.410837+00:00 · methodology

