arxiv: 2604.09376 · v1 · submitted 2026-04-10 · 📊 stat.ME

Maximum-of-Differences Test for Comparing Multivariate K-Sample Distributions

Wei Lan, Long Feng, Runze Li, Chih-Ling Tsai

Pith reviewed 2026-05-10 16:54 UTC · model grok-4.3


keywords maximum-of-differences test · K-sample comparison · multivariate distributions · nonparametric test · extreme value distribution · pairwise distances · connection probabilities · covariance adjustment


The pith

A new maximum-of-differences test compares K multivariate samples by maximizing standardized gaps between within-sample and between-sample connection probabilities from pairwise distances.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a test for whether K multivariate distributions are identical. It pools all observations, builds connections whenever pairwise distances fall below a fixed threshold, and for each point computes the probability of connections staying inside its own sample versus crossing to other samples. The test statistic is the largest standardized squared difference across all points, with a covariance-adjusted variant that has a known limiting distribution. This construction works for any dimension and K, requires no parametric assumptions on the distributions, and extends to regression settings where the goal is to compare conditional distributions.

Core claim

The authors define the MOD statistic as the maximum, over all observations, of the standardized squared difference between the within-sample connection probability and the between-sample connection probability, where two observations are connected if their Euclidean distance is less than a pre-specified threshold. They introduce the covariance-adjusted version CA-MOD whose null limiting distribution is the Type I extreme value distribution under suitable regularity conditions, derive the asymptotic behavior of both statistics under the null and under fixed alternatives, and show that the tests remain applicable to multivariate linear models by replacing raw observations with residuals.

What carries the argument

The MOD statistic, the maximum over observations of the standardized squared difference between within-sample and between-sample connection probabilities induced by a distance-threshold graph on the pooled sample.
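
As a rough illustration of that construction, the sketch below computes a MOD-style statistic in plain Python. The function name mod_sketch and the pooled-binomial variance used for standardization are assumptions made for this example: the material summarized here does not give the paper's exact standardization, and the covariance adjustment behind CA-MOD is not attempted.

```python
import math

def mod_sketch(samples, threshold):
    """Return (largest standardized squared difference, index of that point).

    samples: list of K samples, each a list of equal-length coordinate tuples
    (K >= 2, every sample with at least two points).
    ASSUMPTION: the variance is a pooled-binomial plug-in chosen for
    illustration; it is not taken from the paper.
    """
    points = [(x, k) for k, sample in enumerate(samples) for x in sample]
    n_total = len(points)
    sizes = [len(sample) for sample in samples]
    best_value, best_index = 0.0, None
    for i, (xi, ki) in enumerate(points):
        within = between = 0
        for j, (xj, kj) in enumerate(points):
            if i == j:
                continue
            # Two observations are connected when their Euclidean distance
            # is below the threshold.
            if math.dist(xi, xj) < threshold:
                if kj == ki:
                    within += 1
                else:
                    between += 1
        n_within = sizes[ki] - 1
        n_between = n_total - sizes[ki]
        p_within = within / n_within
        p_between = between / n_between
        p_pooled = (within + between) / (n_total - 1)
        variance = p_pooled * (1 - p_pooled) * (1 / n_within + 1 / n_between)
        if variance == 0:
            continue  # point connected to everything or to nothing
        value = (p_within - p_between) ** 2 / variance
        if value > best_value:
            best_value, best_index = value, i
    return best_value, best_index

# Two well-separated toy samples in the plane.
sample_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
sample_b = [(5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
print(mod_sketch([sample_a, sample_b], threshold=1.0))
```

The second return value is the position, in the pooled ordering, of the observation that attains the maximum, which is what makes a statistic of this shape usable as a pointer to the region driving a rejection.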

If this is right

The test applies directly to any number K of multivariate samples without requiring equal sample sizes.
The same construction yields a test for equality of conditional distributions in a multivariate regression model after replacing observations by residuals.
Under the null, the CA-MOD statistic converges in distribution to the Type I extreme value law, permitting asymptotic p-value calculation.
The test is consistent against alternatives in which at least one observation exhibits a local discrepancy in within-sample versus between-sample connection rates.
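
To make the asymptotic p-value step concrete, here is a minimal sketch of the upper-tail probability of a Type I extreme value (Gumbel) law. The location and scale arguments are placeholders: the centering and scaling constants that CA-MOD would need are not given in the material summarized here, so the defaults correspond only to the standard Gumbel distribution.

```python
import math

def gumbel_upper_tail(statistic, location=0.0, scale=1.0):
    """P(G > statistic) for a Gumbel variable G with the given location and scale.

    Uses the CDF F(x) = exp(-exp(-(x - location) / scale)).
    """
    z = (statistic - location) / scale
    if z < -30.0:
        # exp(-z) is so large that the CDF is numerically zero;
        # returning early also avoids overflow in math.exp.
        return 1.0
    # -expm1(-t) equals 1 - exp(-t) and stays accurate for tiny t.
    return -math.expm1(-math.exp(-z))

print(gumbel_upper_tail(3.0))
```

A test at level alpha would reject when this value, evaluated at the suitably normalized statistic, falls below alpha.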

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Choosing the threshold adaptively from the data rather than fixing it in advance could reduce sensitivity to density variations across the support.
The method supplies a graph-based alternative to integral probability metrics for multi-sample testing and may be combined with existing high-dimensional distance measures.
Because the statistic isolates the single most discrepant observation, it can serve as a diagnostic pointer to which sample or region drives rejection.

Load-bearing premise

The procedure depends on a fixed pre-specified distance threshold for declaring connections together with regularity conditions that guarantee the covariance-adjusted statistic converges to the Type I extreme value distribution.

What would settle it

Repeated simulations drawn from identical multivariate distributions in which the empirical distribution of the CA-MOD statistic deviates systematically from the Type I extreme value limit as sample size grows would refute the claimed convergence.
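
A check of this kind could be organized as a small Monte Carlo harness. The sketch below is only a template: because the CA-MOD normalization is not available in this summary, it uses a stand-in statistic (the maximum of n Exp(1) draws, centered by log n) whose Gumbel limit is a textbook fact. The names empirical_size and toy_null_statistic are hypothetical.

```python
import math
import random

def gumbel_upper_tail(x):
    """P(G > x) for a standard Gumbel variable G."""
    if x < -30.0:
        return 1.0
    return -math.expm1(-math.exp(-x))

def empirical_size(draw_statistic, p_value, n_reps, alpha, rng):
    """Fraction of null replications whose p-value falls below alpha."""
    rejections = 0
    for _ in range(n_reps):
        if p_value(draw_statistic(rng)) < alpha:
            rejections += 1
    return rejections / n_reps

def toy_null_statistic(rng, n=200):
    """STAND-IN for CA-MOD: max of n Exp(1) draws, centered by log(n)."""
    return max(rng.expovariate(1.0) for _ in range(n)) - math.log(n)

rng = random.Random(0)
size = empirical_size(toy_null_statistic, gumbel_upper_tail, 2000, 0.05, rng)
print(f"empirical size at nominal 0.05: {size:.3f}")
```

Replacing toy_null_statistic with the actual normalized CA-MOD statistic computed on data drawn from identical distributions, and tracking the empirical size as the sample size grows, would give the comparison described above.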

Original abstract

Comparing $K$-sample distributions is a fundamental problem in data science that arises in a wide variety of fields and applications. In this article, we introduce a maximum-of-differences approach to make such comparisons. Specifically, we first calculate the pairwise distances from the pooled observations of the $K$ samples. We then define the two observations as connected if their distance is less than a pre-specified threshold value. For each observation, we next calculate the "within" and the "between" probabilities associated with these two types of connections for the given observation, i.e., with other observations within the same sample and between the given observation and the observations in other samples. Subsequently, we propose a maximum-of-differences (MOD) test that finds the maximum value among the standardized squared differences between the "within" and the "between" probabilities of all observations. Accordingly, the proposed test is not only applicable to multivariate data with $K$ samples, but can also be extended to multivariate regression models. Furthermore, we obtain the covariance-adjusted (CA) version of the MOD (CA-MOD) test, which converges to the Type I extreme value distribution under some conditions. Moreover, we demonstrate the asymptotic properties of the two tests under both the null and alternative hypotheses. The performance and usefulness of the tests are illustrated via simulation studies and real examples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a workable distance-to-connection-probability route to a max-of-differences test for multivariate K-samples, but the pre-specified threshold remains the load-bearing and under-specified piece.


The core contribution is a test that pools the data, builds a connection graph from pairwise distances below a fixed threshold, then computes per-observation within-sample and between-sample connection probabilities, standardizes their squared differences, and takes the maximum. A covariance-adjusted version is claimed to converge to a Type I extreme-value limit. The construction is new enough on its face and extends to regression, which is a plus for applied work. Simulations and examples are included to show behavior under null and alternatives, and the asymptotics are stated for both MOD and CA-MOD.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes a maximum-of-differences (MOD) test for comparing K-sample multivariate distributions. Observations are pooled, pairwise distances computed, and pairs declared connected if distance falls below a pre-specified threshold. Per-observation within-sample and between-sample connection probabilities are formed, their squared differences standardized, and the maximum taken as the test statistic. A covariance-adjusted variant (CA-MOD) is introduced that is claimed to converge to the Type I extreme-value distribution under unspecified conditions. Asymptotic properties are derived under both null and alternative hypotheses, the procedure is extended to multivariate regression, and performance is illustrated through simulations and real-data examples.

Significance. If the regularity conditions can be made explicit and a practical rule for threshold selection supplied, the MOD/CA-MOD framework would supply a new nonparametric, graph-based approach to multi-sample multivariate testing that avoids strong parametric assumptions. The claimed extreme-value limit and the extension to regression settings would constitute a genuine methodological contribution provided the supporting derivations are complete and the finite-sample behavior is convincingly documented.

major comments (3)

[Method description / abstract] The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.
[CA-MOD construction] For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).
[Asymptotic results] The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.
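
The first major comment, on how the threshold governs graph sparsity, can be illustrated numerically. The sketch below holds the threshold fixed and measures the fraction of connected pairs for standard normal data of increasing dimension. The Gaussian data, the threshold of 2.0, and the function names are choices made for this illustration only.

```python
import itertools
import math
import random

def connection_density(points, threshold):
    """Fraction of point pairs whose Euclidean distance is below threshold."""
    pairs = list(itertools.combinations(points, 2))
    connected = sum(1 for a, b in pairs if math.dist(a, b) < threshold)
    return connected / len(pairs)

def gaussian_sample(n, dim, rng):
    """n independent standard normal points in the given dimension."""
    return [tuple(rng.gauss(0.0, 1.0) for _ in range(dim)) for _ in range(n)]

rng = random.Random(1)
threshold = 2.0
densities = {
    dim: connection_density(gaussian_sample(100, dim, rng), threshold)
    for dim in (1, 5, 20)
}
for dim, density in densities.items():
    print(f"dimension {dim:2d}: connection density {density:.3f}")
```

Because typical distances between independent standard normal points grow roughly like the square root of twice the dimension, a threshold that yields a dense graph in one dimension leaves the graph nearly empty in twenty, which is why some dimension-dependent scaling seems hard to avoid.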

minor comments (2)

[Abstract] The abstract states that the method “can also be extended to multivariate regression models” but supplies no concrete description of the extension; the main text should contain at least a brief outline of the necessary modifications.
[Simulation section] Simulation results are mentioned but no tables or figures reporting empirical Type I error rates or power under varying dimensions and sample sizes are referenced in the abstract; these should be added with explicit numerical summaries.
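
On the first minor comment, one plausible outline of the regression extension is to fit a linear model and pass residual vectors, rather than raw observations, to the distance-threshold construction. The sketch below handles an intercept plus a single scalar covariate, fitted by ordinary least squares on the pooled data. Both the single-covariate setup and the pooled fit are simplifying assumptions, and ols_residuals is a hypothetical helper, not a function from the paper.

```python
def ols_residuals(covariate, responses):
    """Residual vectors from regressing each response coordinate on an
    intercept and one scalar covariate (ordinary least squares).

    covariate: list of n numbers (not all equal).
    responses: list of n equal-length tuples (multivariate responses).
    """
    n = len(covariate)
    x_mean = sum(covariate) / n
    sxx = sum((x - x_mean) ** 2 for x in covariate)
    dim = len(responses[0])
    residuals = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        y = [row[d] for row in responses]
        y_mean = sum(y) / n
        slope = sum(
            (x - x_mean) * (yi - y_mean) for x, yi in zip(covariate, y)
        ) / sxx
        intercept = y_mean - slope * x_mean
        for i in range(n):
            residuals[i][d] = y[i] - intercept - slope * covariate[i]
    return [tuple(row) for row in residuals]

x = [0.0, 1.0, 2.0, 3.0]
y = [(1.0, 0.0), (2.5, 1.0), (2.0, -1.0), (4.5, 0.5)]
print(ols_residuals(x, y))
```

The residual tuples, grouped by their original sample labels, would then take the place of the observations when pairwise distances are computed.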

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for the constructive feedback on our manuscript. We have carefully considered each of the major comments and provide point-by-point responses below. We plan to incorporate several revisions to address the concerns raised.


Referee: The pre-specified threshold that defines connections (abstract and the method description) is load-bearing: it directly governs the sparsity of the connection graph, the variance of the per-observation probability estimates, and therefore the validity of the standardization step and the subsequent extreme-value convergence. No data-driven selection rule, cross-validation procedure, or dimension-dependent scaling is supplied, leaving the procedure undefined for arbitrary multivariate distributions.

Authors: We agree that providing guidance on threshold selection is important for practical use of the MOD test. Although the current manuscript treats the threshold as a pre-specified parameter (similar to bandwidth in kernel methods), we will revise the manuscript to include a practical data-driven rule, such as selecting the threshold as the median of pairwise distances within a subsample or using a cross-validation approach to optimize the test power. We will also add a discussion on how the choice affects the graph sparsity and include sensitivity analyses in the simulations. revision: yes
Referee: For the CA-MOD statistic the covariance adjustment is performed with an estimate obtained from the same data used to form the test statistic. The manuscript does not clarify whether this estimation is accounted for in the derivation of the null distribution or whether it induces additional dependence that invalidates the claimed Type I extreme-value limit (see the description of the covariance-adjusted version).

Authors: The covariance matrix estimate in CA-MOD is computed from the pooled sample under the null hypothesis, and our derivation shows that the estimation error is negligible in the limit, preserving the extreme-value convergence. However, we acknowledge that this was not explicitly stated. In the revision, we will add a detailed explanation of how the plug-in estimation is accounted for in the asymptotic analysis, including why it does not introduce invalidating dependence. revision: yes
Referee: The regularity conditions required for CA-MOD to converge to the Type I extreme-value distribution are stated only as “under some conditions.” These conditions are never listed explicitly, nor are they verified for the multivariate setting, rendering the asymptotic claim unverifiable from the given material.

Authors: We appreciate this observation. The phrase 'under some conditions' was intended to refer to standard assumptions such as finite moments of the distance indicators, weak dependence between observations, and appropriate scaling of the threshold with sample size. In the revised manuscript, we will explicitly list these regularity conditions in a dedicated subsection and provide verification for the multivariate case under the null hypothesis. revision: yes
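
The median-of-pairwise-distances rule proposed in the first response might look like the sketch below. The function name, the optional subsampling arguments, and the default seed are assumptions for illustration. A threshold chosen from the data in this way is no longer pre-specified, so whether the stated asymptotics carry over would still need to be argued.

```python
import itertools
import math
import random
import statistics

def median_distance_threshold(points, subsample_size=None, rng=None):
    """Median Euclidean distance over all pairs, optionally on a random subsample.

    points: list of equal-length coordinate tuples (pooled over all samples).
    """
    if subsample_size is not None and subsample_size < len(points):
        rng = rng or random.Random(0)
        points = rng.sample(points, subsample_size)
    distances = [math.dist(a, b) for a, b in itertools.combinations(points, 2)]
    return statistics.median(distances)

pooled = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
print(median_distance_threshold(pooled))
```

By construction this choice connects roughly half of all pairs, which keeps the graph from being empty or complete but says nothing about power.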

Circularity Check

0 steps flagged

No significant circularity detected in the derivation chain.


The MOD statistic is constructed directly by pooling observations, computing pairwise distances against a fixed pre-specified threshold, forming per-observation within- and between-connection probabilities, standardizing their squared differences, and taking the maximum. The CA-MOD variant applies a covariance adjustment whose asymptotic convergence to the Type I extreme-value limit is asserted under regularity conditions on the threshold and dependence structure. None of these steps reduces the claimed limit or test by construction to a fitted parameter renamed as a prediction, a self-citation chain, or an ansatz smuggled from prior work; the threshold and covariance estimator are treated as inputs whose consistency is assumed for the mathematical derivation to hold. The procedure therefore remains self-contained as a definition plus conditional asymptotics rather than a tautological loop.

Axiom & Free-Parameter Ledger

1 free parameter · 0 axioms · 0 invented entities

The procedure depends on a user-chosen distance threshold and on covariance estimation for the adjusted version; no other free parameters or invented entities are mentioned.

free parameters (1)

threshold value
Pre-specified cutoff on pairwise distances that defines whether two observations are connected.

