Block-Independent Likelihood Ratio Testing for High-Dimensional Mean Vectors with Applications to Matrix-Variate Data

Johan Lim; Kwangok Seo; Minsub Shin; Sang Han Lee

arxiv: 2605.21848 · v1 · pith:CIWBV52Wnew · submitted 2026-05-21 · 📊 stat.ME

Pith reviewed 2026-05-22 04:59 UTC · model grok-4.3

keywords high-dimensional mean testing · likelihood ratio test · block independence · asymptotic normality · matrix-variate data · power analysis

The pith

The Block Independent Likelihood Ratio Test improves power over diagonal methods for high-dimensional mean vectors by assuming only block-wise independence in the covariance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper proposes the Block Independent Likelihood Ratio Test (BILT) to compare two high-dimensional mean vectors when the number of variables p is large compared to the sample size n. It relaxes the full independence assumption used in prior diagonal likelihood ratio tests to a milder block independence structure that allows correlations within blocks. The authors derive the asymptotic normality of the BILT statistic under the null hypothesis in the regime of increasing p with small n, and they obtain its limiting distribution under local alternatives for power calculations. Simulations across varied covariance patterns confirm that BILT controls type I error while delivering higher power than the diagonal version, and the approach is illustrated on matrix-variate neuroimaging data.

Core claim

By replacing the working independence assumption with a block independence assumption on the covariance matrix, the resulting likelihood ratio statistic for testing equality of two high-dimensional means converges in distribution to a standard normal under the null when p grows while n remains small, and it attains higher asymptotic power than tests that force complete diagonalization.

What carries the argument

The BILT statistic constructed from the block-diagonal approximation to the pooled covariance matrix under the block independence assumption.
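
As a rough sketch of how such a statistic can be assembled, the display below combines the block term $U_{N,k}$ quoted from the paper further down this page with two assumptions that the excerpt does not confirm: that $A_{N,k}$ is the two-sample Hotelling $T^2$ of block $k$ computed with the pooled within-block covariance $S_k$, and that the test statistic is the centered and scaled sum over $K$ blocks. The symbols $T_N$, $S_k$, $b_k$, and $K$ are introduced here for illustration only.

```latex
\begin{align*}
A_{N,k} &= \frac{n_1 n_2}{N}\,
  (\bar{X}_{1,k}-\bar{X}_{2,k})^{\top} S_k^{-1} (\bar{X}_{1,k}-\bar{X}_{2,k}),
  \qquad N = n_1 + n_2, \\
U_{N,k} &= N \log\!\Bigl(1 + \frac{A_{N,k}}{N-2}\Bigr), \\
T_N &= \frac{\sum_{k=1}^{K}\bigl(U_{N,k} - \mathbb{E}_0[U_{N,k}]\bigr)}
            {\sqrt{\sum_{k=1}^{K}\operatorname{Var}_0(U_{N,k})}}.
\end{align*}
```

Under Gaussian data and a block of size $b_k$, the classical result that $1/(1 + A_{N,k}/(N-2))$ follows a $\mathrm{Beta}\bigl((N-b_k-1)/2,\ b_k/2\bigr)$ distribution gives $\mathbb{E}_0[U_{N,k}] = N\{\psi((N-1)/2) - \psi((N-b_k-1)/2)\}$ and $\operatorname{Var}_0(U_{N,k}) = N^2\{\psi'((N-b_k-1)/2) - \psi'((N-1)/2)\}$, where $\psi$ is the digamma function. The paper may center and scale differently; with $b_k = 1$ the block term reduces to a coordinate-wise likelihood ratio, which is the diagonal case.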

If this is right

The null distribution of the BILT statistic is standard normal asymptotically when dimension increases with small sample size under mild regularity conditions.
The test possesses a non-central normal limit under local alternatives, permitting explicit power formulas.
BILT maintains type I error control and shows substantially higher power than the diagonal likelihood ratio test across a range of covariance structures in finite samples.
The procedure extends immediately to matrix-variate observations, as shown by its application to two-group comparison in the ADNI dataset.
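
To illustrate how matrix-variate observations could be fed into a block-based test, the sketch below vectorizes each matrix and builds index blocks from its columns or rows. The choice of rows or columns as blocks is an assumption made here for illustration; the block structures the paper actually uses (its Figure 7) are not available in this excerpt, and the helper names `vectorise` and `matrix_blocks` are hypothetical.

```python
import numpy as np

def vectorise(samples):
    """Stack an (n, n_rows, n_cols) array into an (n, n_rows * n_cols) data
    matrix using column-major vec(), so entry (i, j) lands at j * n_rows + i."""
    n = samples.shape[0]
    return samples.transpose(0, 2, 1).reshape(n, -1)

def matrix_blocks(n_rows, n_cols, by="column"):
    """Index sets into the vectorised data, one block per column or per row.
    Treating rows or columns as blocks is an illustrative assumption."""
    idx = np.arange(n_rows * n_cols).reshape(n_rows, n_cols, order="F")
    if by == "column":
        return [idx[:, j] for j in range(n_cols)]
    if by == "row":
        return [idx[i, :] for i in range(n_rows)]
    raise ValueError("by must be 'column' or 'row'")

# Toy data: 2 observations of a 3 x 4 matrix.
samples = np.arange(2 * 3 * 4).reshape(2, 3, 4)
X = vectorise(samples)
col_blocks = matrix_blocks(3, 4, by="column")
row_blocks = matrix_blocks(3, 4, by="row")
```

Each block is an array of column indices into `X`, so a block-wise statistic only needs `X[:, block]` for every block in turn.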

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If domain knowledge supplies reliable block partitions, the same relaxation could improve other high-dimensional procedures that currently rely on strict independence.
Data-driven methods for choosing or refining the blocks might further increase power while preserving the asymptotic normality result.
Similar block-structured assumptions could be useful for related problems such as covariance testing or regression in the large-p-small-n setting.

Load-bearing premise

The variables admit a known or correctly specified partition into blocks such that variables in different blocks are independent and the within-block dependence does not distort the limiting null distribution.

What would settle it

A Monte Carlo experiment in which the true covariance violates the proposed block partition yet the test is applied anyway, checking whether the empirical type I error rate stays near the nominal level or the power advantage disappears.
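
A minimal sketch of such an experiment follows. It is not the paper's code: it assumes Gaussian data, takes each block term to be N log(1 + T²/(N − 2)) with T² the block-wise two-sample Hotelling statistic, and standardizes the sum with the exact normal-theory moments of that term (digamma and trigamma differences). The function names, the AR(1) covariance, and all parameter values are hypothetical choices for illustration.

```python
import math
import numpy as np

def digamma(x):
    """psi(x), x > 0: shift upward by recurrence, then asymptotic series."""
    acc = 0.0
    while x < 6.0:
        acc -= 1.0 / x
        x += 1.0
    f = 1.0 / (x * x)
    return acc + math.log(x) - 0.5 / x - f * (1 / 12 - f * (1 / 120 - f / 252))

def trigamma(x):
    """psi'(x), x > 0, by the same recurrence-plus-series approach."""
    acc = 0.0
    while x < 6.0:
        acc += 1.0 / (x * x)
        x += 1.0
    f = 1.0 / (x * x)
    return acc + 1.0 / x + f / 2 + (f / x) * (1 / 6 - f * (1 / 30 - f / 42))

def block_lrt_z(X1, X2, blocks):
    """Standardised sum of block-wise LRT terms (assumed form, Gaussian centring)."""
    n1, n2 = X1.shape[0], X2.shape[0]
    N = n1 + n2
    diff = X1.mean(axis=0) - X2.mean(axis=0)
    R1, R2 = X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)
    total = mean0 = var0 = 0.0
    for idx in blocks:
        b = len(idx)
        # Pooled within-block covariance; needs N - 2 >= b to be invertible.
        S = (R1[:, idx].T @ R1[:, idx] + R2[:, idx].T @ R2[:, idx]) / (N - 2)
        A = (n1 * n2 / N) * diff[idx] @ np.linalg.solve(S, diff[idx])
        total += N * math.log1p(A / (N - 2))
        a, c = (N - b - 1) / 2, (N - 1) / 2
        mean0 += N * (digamma(c) - digamma(a))
        var0 += N ** 2 * (trigamma(a) - trigamma(c))
    return (total - mean0) / math.sqrt(var0)

def ar1_cov(p, rho):
    """AR(1) covariance: correlation rho^|i-j| between all pairs of variables."""
    i = np.arange(p)
    return rho ** np.abs(i[:, None] - i[None, :])

def block_ar1_cov(p, b, rho):
    """AR(1) within consecutive blocks of size b, zero across blocks."""
    block_id = np.arange(p) // b
    return ar1_cov(p, rho) * (block_id[:, None] == block_id[None, :])

def rejection_rate(cov, blocks, n1, n2, reps, rng, z_crit=1.6449):
    """Empirical one-sided rejection rate under equal means (the null)."""
    L = np.linalg.cholesky(cov)
    p = cov.shape[0]
    z = np.empty(reps)
    for r in range(reps):
        X1 = rng.standard_normal((n1, p)) @ L.T
        X2 = rng.standard_normal((n2, p)) @ L.T
        z[r] = block_lrt_z(X1, X2, blocks)
    return float(np.mean(z > z_crit)), z

rng = np.random.default_rng(0)
p, b = 100, 5
blocks = [np.arange(s, s + b) for s in range(0, p, b)]
# Partition matches the true covariance.
rate_ok, z_ok = rejection_rate(block_ar1_cov(p, b, 0.6), blocks, 15, 15, 300, rng)
# True covariance is full AR(1), so the assumed partition is violated.
rate_bad, z_bad = rejection_rate(ar1_cov(p, 0.6), blocks, 15, 15, 300, rng)
print(f"correct blocks: {rate_ok:.3f}   misspecified blocks: {rate_bad:.3f}")
```

Comparing the two printed rates against the nominal 0.05 gives a first impression of sensitivity to misspecification; a serious check would use more replications, several dimensions and correlation strengths, and a matching power comparison.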

Figures

Figures reproduced from arXiv: 2605.21848 by Johan Lim, Kwangok Seo, Minsub Shin, Sang Han Lee.

**Figure 2.** Type I error of BILT with n1 = n2 = 50. Each panel corresponds to a covariance structure. Within each panel, results for block sizes b ∈ {1, 2, 5, 10} are displayed. The x-axis represents the dimension p, and the y-axis represents the Type I error obtained from 3,000 replications. The horizontal dashed line indicates the nominal Type I error level of 0.05.

**Figure 3.** Power of BILT with n1 = n2 = 50 and p = 500. Each panel corresponds to a covariance structure. Within each panel, results for block sizes b ∈ {1, 2, 5, 10} are displayed. The x-axis represents the signal magnitude δ/√p, and the y-axis represents the power obtained from 3,000 replications. The horizontal dashed line indicates the nominal Type I error level of 0.05.

**Figure 4.** Power of BILT with n1 = n2 = 50 and p = 1,000. Each panel corresponds to a covariance structure. Within each panel, results for block sizes b ∈ {1, 2, 5, 10} are displayed. The x-axis represents the non-null proportion, and the y-axis represents the power obtained from 3,000 replications. The horizontal dashed line indicates the nominal Type I error level of 0.05.

**Figure 5.** Power comparison of BS, SD, CQ, ARHT, aSPU, DLRT, and BILT.

**Figure 6.** (1) A brain MRI image with the corpus callosum (CC) highlighted by its outline. (2) The …

**Figure 7.** Illustration of the block structures used in BILT.

Original abstract

Testing the equality of two high-dimensional mean vectors is a fundamental problem in multivariate analysis. While the classical Hotelling's $T^2$ test is optimal in low-dimensional settings, it fails when the dimension $p$ is comparable to or exceeds the sample size $n$. Several extensions, including the Diagonal Likelihood Ratio Test (DLRT), have been proposed under the working independence assumption among variables. However, such an assumption can lead to a substantial loss of power when correlations are present. In this paper, we propose a new test, the Block Independent Likelihood Ratio Test (BILT), which generalizes DLRT by relaxing the working independence assumption to a block independence assumption. We establish the asymptotic normality of the null distribution of the BILT statistic for 'increasing $p$ with small $n$' under mild regularity conditions. We further analyze the asymptotic power of BILT under local alternatives. Extensive simulation studies show that BILT maintains Type I error control and achieves substantially higher power than DLRT across a wide range of covariance structures. An application to the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset further demonstrates the application of BILT to testing mean differences between two matrix-variate populations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

BILT improves power over DLRT by allowing block independence in high-dimensional mean tests, but the asymptotic normality rests on blocks being fixed and correctly specified.

The main takeaway is that this paper gives a middle-ground test between full independence and full covariance modeling for high-dimensional mean vectors. It generalizes the diagonal likelihood ratio test by summing likelihood ratios over blocks that are assumed independent of each other. They derive asymptotic normality of the null distribution when p grows with n fixed or small, plus the local power behavior, and the simulations show it controls type I error while delivering higher power than DLRT across several covariance patterns. The ADNI example shows it can be applied directly to matrix-variate data in neuroimaging settings. That combination of extension, theory, and practical checks is the useful part.

The soft spot is the block structure. The null distribution derivation treats the partition as known and exact, so the covariance is block-diagonal under the working assumption. In applications the blocks would usually need to be chosen or estimated from the same data, which adds variability the mild regularity conditions do not appear to bound. If that extra term is not negligible, the claimed normality can shift. The paper would be tighter if it either fixed the blocks in advance or supplied a selection rule with its own approximation result.

This is aimed at statisticians working on high-dimensional multivariate tests who already know DLRT and want a modest power gain without estimating the full covariance. Readers who run simulations or analyze structured data like images will see the most direct value. The work is coherent on its own terms and rests on standard asymptotic arguments, so it deserves a serious referee rather than a desk rejection. I would send it out for review.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes the Block Independent Likelihood Ratio Test (BILT) as a generalization of the Diagonal Likelihood Ratio Test (DLRT) for testing equality of two high-dimensional mean vectors. It relaxes the full independence assumption to a block-independence structure on the covariance, establishes asymptotic normality of the BILT statistic under the null for the regime of increasing dimension p with small sample size n under mild regularity conditions, derives the asymptotic power under local alternatives, reports simulation results showing Type I error control and power gains over DLRT across covariance structures, and applies the method to matrix-variate data from the ADNI study.

Significance. If the asymptotic normality and power results hold under the stated conditions with blocks either known or estimated without material error, the work offers a practical improvement over DLRT by incorporating within-block dependence, which can yield meaningful power gains in correlated high-dimensional settings. The matrix-variate application is a clear strength, and the simulation design appears to cover a range of structures. The contribution would be more substantial if the block-handling procedure were shown to preserve the limiting null distribution.

major comments (2)

[Asymptotic normality section (null distribution)] The derivation of asymptotic normality for the BILT statistic (under increasing p, small n) treats the block partition as fixed and correctly specified so that the covariance is exactly block-diagonal. If blocks must be chosen or estimated from the same data, the resulting statistic is no longer a function of independent block-wise likelihood ratios; the asymptotic normality claim then rests on an unstated uniform approximation that ignores the additional variability from block selection. The mild regularity conditions cited do not explicitly bound this extra term.
[Asymptotic power analysis] The power analysis under local alternatives similarly assumes the block structure is known without estimation error. It is unclear whether the local-alternative power expression remains valid when blocks are data-driven, as the extra variability could alter the non-centrality parameter.

minor comments (2)

[Introduction / Assumptions] The abstract and introduction refer to 'mild regularity conditions' without listing them explicitly; a dedicated subsection or remark stating the precise assumptions (e.g., on moments, block sizes, and eigenvalue bounds) would improve readability.
[Simulation studies] Simulation tables would benefit from reporting the exact block-selection method used in each scenario and the resulting average block sizes, to allow readers to assess sensitivity to misspecification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below, clarifying the scope of our asymptotic results.

Referee: [Asymptotic normality section (null distribution)] The derivation of asymptotic normality for the BILT statistic (under increasing p, small n) treats the block partition as fixed and correctly specified so that the covariance is exactly block-diagonal. If blocks must be chosen or estimated from the same data, the resulting statistic is no longer a function of independent block-wise likelihood ratios; the asymptotic normality claim then rests on an unstated uniform approximation that ignores the additional variability from block selection. The mild regularity conditions cited do not explicitly bound this extra term.

Authors: We agree that the asymptotic normality derivation assumes the block partition is fixed and known a priori, so that the covariance is exactly block-diagonal and the block-wise likelihood ratio components are independent. The mild regularity conditions suffice for this setting, allowing direct application of a central limit theorem to the normalized sum of the block contributions. We do not claim the result extends automatically to data-driven block selection. In the revised manuscript we will state this assumption explicitly in the theorem and add a remark in the discussion acknowledging that block estimation introduces extra variability not controlled by the current conditions. We view a full extension to estimated blocks as an interesting direction for future work rather than part of the present contribution.

revision: yes

Referee: [Asymptotic power analysis] The power analysis under local alternatives similarly assumes the block structure is known without estimation error. It is unclear whether the local-alternative power expression remains valid when blocks are data-driven, as the extra variability could alter the non-centrality parameter.

Authors: The local-alternative power analysis is likewise derived under a known, fixed block structure; the non-centrality parameter is expressed in terms of the block-wise mean differences and the block-diagonal covariance. When blocks are estimated from the data the non-centrality could be perturbed, and the current closed-form expression would require additional justification. In the revision we will explicitly note that the power results hold for known blocks and briefly discuss the potential effect of block estimation as a limitation of the present analysis.

revision: yes

Circularity Check

0 steps flagged

No circularity: asymptotic normality follows from standard arguments under stated block-independence assumption

The paper derives the null distribution of the BILT statistic via standard asymptotic arguments for high-dimensional settings with increasing p and small n, under the explicit block-independence assumption and mild regularity conditions. The central claim does not reduce by construction to a fitted parameter or self-citation chain; the block partition is treated as given or correctly specified without introducing estimation variability into the limiting distribution. This is the most common honest finding for papers that rely on classical limit theorems rather than data-driven renormalization of the test statistic itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on standard high-dimensional asymptotic theory and the block-independence modeling assumption; no free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: Mild regularity conditions hold for the asymptotic normality result under increasing p with small n. Invoked to establish the null distribution of the BILT statistic.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith.Foundation.RealityFromDistinction · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We propose a new test, the Block Independent Likelihood Ratio Test (BILT), which generalizes DLRT by relaxing the working independence assumption to a block independence assumption. We establish the asymptotic normality of the null distribution of the BILT statistic for 'increasing p with small n' under mild regularity conditions."

IndisputableMonolith.Cost.FunctionalEquation · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "The BILT statistic is constructed as a likelihood ratio under the working block independence assumption... Let U_{N,k} = N log(1 + A_{N,k}/(N-2))."

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
