Testing linear combinations of multiple variance components

Alex Stringer; Jeffrey Negrea

arxiv: 2604.25744 · v1 · submitted 2026-04-28 · 📊 stat.ME


Pith reviewed 2026-05-07 15:17 UTC · model grok-4.3

classification 📊 stat.ME

keywords variance components · parametric bootstrap · hypothesis testing · linear contrasts · Gaussian models · mixed effects · residual log-likelihood · random effects


The pith

A parametric bootstrap procedure tests whether several linear contrasts of multiple variance components simultaneously equal zero in Gaussian models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a parametric bootstrap method to test whether certain linear combinations of variance components are all zero at once. This covers Gaussian models for experimental designs with multiple random effects, including nested and crossed structures. The approach includes an efficient decomposition of the residual log-likelihood that avoids requiring variance components to be non-negative or design matrices to be positive semi-definite, plus a modified Newton optimizer and constrained sampling under the null. A notable special case fills a prior gap by enabling a test for several variance components equaling zero simultaneously. Researchers gain a flexible tool for joint hypotheses on sources of variation that standard likelihood ratio tests could not handle in general.
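As a reading aid, the hypothesis described above can be written in matrix form. The contrast matrix A and the dimensions q and k are notation assumed here for illustration; this page does not show the paper's own symbols beyond the τ that appears in the figure captions.

```latex
% Assumed notation: \tau collects the k variance components,
% A is a q x k matrix whose rows are the linear contrasts being tested.
H_0 : A\tau = 0, \qquad \tau = (\tau_1, \dots, \tau_k)^{\top}, \quad A \in \mathbb{R}^{q \times k}

% Equality of two components (the case shown in the figures):
A = \begin{pmatrix} 1 & -1 \end{pmatrix} \;\Longrightarrow\; H_0 : \tau_1 = \tau_2

% All k components zero at once (the special case highlighted in the abstract):
A = I_k \;\Longrightarrow\; H_0 : \tau_1 = \dots = \tau_k = 0
```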

Core claim

We test the hypothesis that simultaneous linear contrasts of multiple variance components equal zero in a Gaussian variance components model via a parametric bootstrap. The main technical contributions are a computationally efficient decomposition of the normalized residual log-likelihood that does not require the variance components to be non-negative or variance design matrices to be positive semi-definite, a modified Newton method for its minimization, and a method for efficient optimization and sampling under the null hypothesis that certain linear combinations of variance components equal zero. A special case of the proposed procedure is a test for multiple variance components simultaneously equalling zero, for which a likelihood ratio test was not previously available.
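To make the general shape of a parametric bootstrap test concrete, here is a minimal sketch for the simplest possible case: a balanced one-way layout with the null hypothesis that the between-group variance component is zero. It is not the authors' procedure. The test statistic is an ordinary ANOVA F ratio standing in for their residual-likelihood statistic, and every function name below is hypothetical.

```python
import random
import statistics


def f_statistic(groups):
    """One-way ANOVA F ratio (between-group over within-group mean square).

    Assumes a balanced design: every group has the same number of observations.
    """
    m = len(groups)
    n = len(groups[0])
    grand = statistics.fmean(v for g in groups for v in g)
    means = [statistics.fmean(g) for g in groups]
    msb = n * sum((gm - grand) ** 2 for gm in means) / (m - 1)
    msw = sum((v - gm) ** 2 for g, gm in zip(groups, means) for v in g) / (m * (n - 1))
    return msb / msw


def simulate_null(m, n, mu, sigma2, rng):
    """Draw one dataset from the fitted null model (no between-group variance)."""
    sd = sigma2 ** 0.5
    return [[rng.gauss(mu, sd) for _ in range(n)] for _ in range(m)]


def bootstrap_p_value(groups, n_boot=199, seed=0):
    """Parametric bootstrap p-value for H0: between-group variance is zero."""
    rng = random.Random(seed)
    m, n = len(groups), len(groups[0])
    values = [v for g in groups for v in g]
    # Fit the null model: under H0 all observations are i.i.d. Gaussian.
    mu_hat = statistics.fmean(values)
    sigma2_hat = statistics.pvariance(values, mu_hat)
    t_obs = f_statistic(groups)
    # Count bootstrap statistics at least as extreme as the observed one.
    exceed = sum(
        f_statistic(simulate_null(m, n, mu_hat, sigma2_hat, rng)) >= t_obs
        for _ in range(n_boot)
    )
    # The +1 terms keep the p-value strictly positive.
    return (1 + exceed) / (1 + n_boot)


if __name__ == "__main__":
    data_rng = random.Random(1)
    # Four groups of five observations with a large group effect.
    strong = [[10.0 * i + data_rng.gauss(0.0, 1.0) for _ in range(5)] for i in range(4)]
    print(bootstrap_p_value(strong))
```

The three steps (fit under the null, simulate from the fitted null, compare the observed statistic with the simulated ones) are common to parametric bootstrap tests in general. The paper's contribution, as summarized above, concerns how to carry out the fitting and sampling steps efficiently for linear contrasts of several variance components.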

What carries the argument

Parametric bootstrap test for linear contrasts of variance components, enabled by an efficient decomposition of the normalized residual log-likelihood that avoids non-negativity and positive semi-definiteness constraints.

Load-bearing premise

The observations follow a Gaussian variance components model and the parametric bootstrap accurately approximates the null distribution of the test statistic for the linear contrasts.
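For orientation, a Gaussian variance components model and its residual (REML) log-likelihood are commonly written as follows. The notation V = Σ θ_i G_i follows the referee report further down this page; X, β and P are standard symbols assumed here. The paper's "normalized" residual log-likelihood may differ from this textbook form by scaling or sign, and how the τ of the figure captions relates to θ is not stated on this page.

```latex
% Model: fixed effects X\beta, covariance linear in the variance parameters.
y \sim \mathcal{N}\bigl(X\beta,\; V(\theta)\bigr), \qquad V(\theta) = \sum_i \theta_i G_i

% Textbook residual log-likelihood, up to an additive constant.
\ell_R(\theta) = -\tfrac{1}{2}\Bigl[\log\det V + \log\det\bigl(X^{\top} V^{-1} X\bigr) + y^{\top} P\, y\Bigr] + \text{const}

% Projection-type matrix appearing in the quadratic form.
P = V^{-1} - V^{-1} X \bigl(X^{\top} V^{-1} X\bigr)^{-1} X^{\top} V^{-1}
```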

What would settle it

A simulation study in which the test rejects the null at rates substantially different from the nominal significance level when the null is true would show the bootstrap approximation fails.
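The check described above reduces to computing an empirical rejection rate from p-values simulated under a true null and comparing it with the nominal level. The sketch below assumes the p-values have already been produced by some test; the function names and the three-standard-error tolerance are illustrative choices, not taken from the paper.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Fraction of p-values at or below alpha, with its Monte Carlo standard error."""
    n = len(p_values)
    rate = sum(p <= alpha for p in p_values) / n
    se = math.sqrt(rate * (1.0 - rate) / n)
    return rate, se


def is_consistent_with_nominal(p_values, alpha=0.05, z=3.0):
    """Crude calibration check under a true null.

    Returns True when the observed rejection rate lies within z binomial
    standard errors (computed at the nominal level) of alpha.
    """
    rate, _ = rejection_rate(p_values, alpha)
    se_nominal = math.sqrt(alpha * (1.0 - alpha) / len(p_values))
    return abs(rate - alpha) <= z * se_nominal


if __name__ == "__main__":
    # Evenly spaced p-values mimic a perfectly calibrated test.
    calibrated = [(i + 0.5) / 1000 for i in range(1000)]
    print(rejection_rate(calibrated), is_consistent_with_nominal(calibrated))
```

With 1000 simulated datasets, as in the figures, the binomial standard error at a 5% level is roughly 0.007, so rejection rates well outside about 3% to 7% would point to a miscalibrated test.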

Figures

Figures reproduced from arXiv: 2604.25744 by Alex Stringer, Jeffrey Negrea.

**Figure 1.** Empirical power (with Monte Carlo standard error bands) for testing …

**Figure 2.** Empirical power for testing H0 : τ1 = τ2 against the two-sided alternative H1 : τ1 ≠ τ2 in 1000 simulated datasets from the crossed model in Eq. (21) for each choice of the two sample sizes m and n and common τ values, and for a balanced design as well as the unbalanced design with ρ = 0, 0.5. In the designs with more unbalanced sample sizes the power decreases for less balanced effects.

**Figure 3.** Bootstrapped sampling distributions of four functions of …

**Figure 4.** Bootstrapped sampling distributions of four functions of …

**Figure 5.** Bootstrapped sampling distributions of four functions of …

**Figure 6.** Empirical p-values for testing H0 : τ1 = τ2 against the two-sided alternative H1 : τ1 ≠ τ2 in 1000 simulated datasets from the nested model for each choice of the three sample sizes m, n, and r and common τ values in a balanced design. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the location of maximum discrep…

**Figure 7.** Empirical p-values for testing H0 : τ1 = τ2 against the one-sided alternative H1 : τ1 > τ2 in 1000 simulated datasets from the nested model for each choice of the three sample sizes m, n, and r and common τ values in a balanced design. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the location of maximum discrepa…

**Figure 8.** KS test statistics for uniformity of the empirical p-values for testing …

**Figure 9.** Estimated common value Q_2^T τ̂ = (1/2)(τ̂1 + τ̂2) in 1000 simulated datasets from the nested model for each choice of the three sample sizes m, n, and r and common τ values in a balanced design. The null hypothesis of τ1 = τ2 is true in each simulated dataset. Larger, more balanced data tend to result in less uncertainty. Horizontal lines show the true value of τ1 = τ2 for each simulation.

**Figure 10.** Empirical power (with Monte Carlo standard error bands) for testing …

**Figure 11.** Empirical power (with Monte Carlo standard error bands) for testing …

**Figure 12.** Empirical p-values for testing H0 : τ1 = τ2 against the two-sided alternative H1 : τ1 ≠ τ2 in 1000 simulated datasets from the nested model with an unbalanced design having m groups of average size n and average number of replications r, and common τ values. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the loc…

**Figure 13.** Empirical p-values for testing H0 : τ1 = τ2 against the one-sided alternative H1 : τ1 > τ2 in 1000 simulated datasets from the nested model with an unbalanced design having m groups of average size n and average number of replications r, and common τ values. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the locat…

**Figure 14.** KS test statistics for uniformity of the empirical p-values for testing …

**Figure 15.** Estimated common value Q_2^T τ̂ = (1/2)(τ̂1 + τ̂2) in 1000 simulated datasets from the nested model for each choice of the three sample sizes m, n, and r and common τ values in an unbalanced design. The null hypothesis of τ1 = τ2 is true in each simulated dataset. Larger, more balanced data tend to result in less uncertainty. Horizontal lines show the true value of τ1 = τ2 for each simulation.

**Figure 16.** Empirical power (with Monte Carlo standard error bands) for testing …

**Figure 17.** Empirical power (with Monte Carlo standard error bands) for testing …

**Figure 18.** Empirical p-values for testing H0 : τ1 = τ2 against the two-sided alternative H1 : τ1 ≠ τ2 in 1000 simulated datasets from the crossed model for each choice of the two sample sizes m and n, common τ values, and balance parameter r. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the location of maximum discrepanc…

**Figure 19.** Empirical p-values for testing H0 : τ1 = τ2 against the one-sided alternative H1 : τ1 > τ2 in 1000 simulated datasets from the crossed model for each choice of the two sample sizes m and n, common τ values, and balance parameter r. The null hypothesis is true in each simulated dataset and the common τ1 = τ2 values are indicated by separate lines. Vertical lines indicate the location of maximum discrepancy…

**Figure 20.** KS test statistics for uniformity of the empirical p-values for testing …

**Figure 21.** Estimated common value Q_2^T τ̂ = (1/2)(τ̂1 + τ̂2) in 1000 simulated datasets from the crossed model for each choice of the two sample sizes m and n, common τ values, and balance parameter r. The null hypothesis is true in each simulated dataset.

**Figure 22.** Empirical power for testing H0 : τ1 = τ2 against the one-sided alternative H1 : τ1 > τ2 in 1000 simulated datasets from the crossed model for each choice of the two sample sizes m and n and common τ values and for each choice of balance parameter r = −1, 0, 0.5. In the designs with more unbalanced sample sizes the power decreases for less balanced effects.
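Figures 8, 14 and 20 report KS test statistics for uniformity of the empirical p-values. For readers unfamiliar with that diagnostic, the sketch below computes the one-sample Kolmogorov-Smirnov statistic against the Uniform(0, 1) distribution. It is a generic textbook computation with a hypothetical function name, not code from the paper.

```python
def ks_uniform_statistic(p_values):
    """One-sample Kolmogorov-Smirnov statistic against Uniform(0, 1).

    Returns the largest vertical gap between the empirical CDF of the
    p-values and the identity line, checked just before and at each jump.
    """
    u = sorted(p_values)
    n = len(u)
    # Gap where the empirical CDF sits above the uniform CDF.
    d_plus = max((i + 1) / n - x for i, x in enumerate(u))
    # Gap where the empirical CDF sits below the uniform CDF.
    d_minus = max(x - i / n for i, x in enumerate(u))
    return max(d_plus, d_minus)


if __name__ == "__main__":
    print(ks_uniform_statistic([0.25, 0.75]))
```

Under a true null and a well-calibrated test the p-values should be close to uniform, so small values of this statistic are the desirable outcome.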

Original abstract

We test the hypothesis that simulataneous linear contrasts of multiple variance components equal zero in a Gaussian variance components model via a parametric bootstrap. Applications include but are not limited to nested and crossed designs. The main technical contributions are a computationally efficient decomposition of the normalized residual log-likelihood that does not require the variance components to be non-negative or variance design matrices to be positive semi-definite, a modified Newton method for its minimization, and a method for efficient optimization and sampling under the null hypothesis that certain linear combinations of variance components equal zero. A special case of the proposed procedure is a test for multiple variance components simulataneously equalling zero, for which a likelihood ratio test was not previously available. However, the proposed procedure is significantly more general.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper gives a parametric bootstrap for testing linear contrasts of variance components in mixed models, including a new special case for several components at zero, but the claimed decomposition without PSD requirements on V needs direct checking against the Gaussian likelihood definition.


The authors decompose the normalized residual log-likelihood so that minimization and sampling under the null do not force non-negative components or positive semi-definite G_i matrices, then add a modified Newton method and efficient null-constrained optimization. The special case fills a gap where no LRT existed before, and the general procedure covers nested and crossed designs without the usual boundary problems. The computational efficiency looks like the real practical gain for people who fit these models regularly. The bootstrap from the fitted null is a clean way to get p-values for the contrasts.

The soft spot is the handling of positive semi-definiteness. The Gaussian log-likelihood requires log det(V) and the quadratic form to be defined over the reals, so V must stay PSD. If the decomposition lets theta go negative or G_i lose semi-definiteness in intermediate steps, the objective can become undefined and the Newton iterates or bootstrap samples can break. The abstract states the method works without those constraints, so either the authors implicitly stay in the valid region or the decomposition has a safeguard that is not obvious from the summary. That needs explicit proof or simulation evidence in the full text.

This is for applied statisticians and methodologists who test variance components in longitudinal or genetic data. A reader who needs to implement such tests will get concrete code-level ideas if the PSD handling is solid. It deserves peer review because the gap is real and the computational contribution is concrete, even if the likelihood definition under relaxed constraints requires clarification or a small fix.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a parametric bootstrap procedure for testing hypotheses that linear combinations of multiple variance components equal zero in a Gaussian variance components model. Central technical contributions include a computationally efficient decomposition of the normalized residual log-likelihood that does not require non-negative variance components or positive semi-definite design matrices, a modified Newton method for its minimization, and methods for efficient optimization and sampling under the null. A highlighted special case is simultaneous testing of multiple variance components equaling zero, for which a likelihood ratio test was previously unavailable; the procedure is presented as more general and applicable to nested and crossed designs.

Significance. If the core procedure is valid, the work would provide a useful general tool for hypothesis testing on linear contrasts of variance components in linear mixed models, extending beyond existing methods limited to individual components or specific designs. The claimed computational efficiency, avoidance of non-negativity constraints, and availability of a test where LRT was unavailable are potential strengths for practical applications in statistics.

major comments (1)

[Abstract] Abstract and method description: the central claim that the decomposition of the normalized residual log-likelihood does not require non-negative variance components or positive semi-definite design matrices G_i conflicts with the requirements of the Gaussian model. For the log-likelihood (log det(V) and quadratic form) to be defined over the reals, V = sum theta_i G_i must remain positive semi-definite. If the decomposition permits regions with negative eigenvalues, the objective function, Newton minimization, and parametric bootstrap samples under the linear null become ill-posed. This is load-bearing for both the special-case test of multiple components equaling zero and the general linear-contrast tests, and requires explicit clarification or safeguards.
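The concern above can be illustrated on a toy example. The sketch below builds V = Σ theta_i G_i from two small matrices and uses a hand-written Cholesky factorization to decide whether log det(V) is defined. The matrices, the tolerance and the function names are assumptions chosen for illustration. Note that the Cholesky test checks strict positive definiteness, which is what a finite log det(V) actually needs.

```python
import math


def cholesky_logdet(v, tol=1e-12):
    """Return log det(V) via a Cholesky factorization.

    Returns None when a pivot is not strictly positive, i.e. when V is
    not positive definite and the Gaussian log-likelihood is undefined.
    """
    n = len(v)
    low = [[0.0] * n for _ in range(n)]
    logdet = 0.0
    for j in range(n):
        pivot = v[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if pivot <= tol:
            return None
        low[j][j] = math.sqrt(pivot)
        logdet += math.log(pivot)  # det(V) is the product of the pivots
        for i in range(j + 1, n):
            s = v[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return logdet


def covariance(theta, mats):
    """Form V = sum_i theta_i * G_i for equally sized square matrices G_i."""
    n = len(mats[0])
    return [
        [sum(t * g[r][c] for t, g in zip(theta, mats)) for c in range(n)]
        for r in range(n)
    ]


# Toy design (illustrative): residual identity plus one shared group effect.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
ONES = [[1.0, 1.0], [1.0, 1.0]]

if __name__ == "__main__":
    # A negative coefficient that still gives a valid covariance matrix.
    print(cholesky_logdet(covariance([2.0, -0.5], [IDENTITY, ONES])))
    # A more negative coefficient that does not.
    print(cholesky_logdet(covariance([2.0, -1.5], [IDENTITY, ONES])))
```

In this toy case V has eigenvalues theta_1 and theta_1 + 2·theta_2, so theta = (2, −0.5) yields a positive definite V even though one coefficient is negative, while theta = (2, −1.5) does not. This shows that non-negativity of each theta_i and definiteness of V are distinct conditions, which is the distinction the referee asks the authors to make explicit.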

minor comments (2)

[Abstract] Typo: 'simulataneous' should be 'simultaneous'.
[Abstract] Typo: 'simulataneously' should be 'simultaneously'.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and for raising this important point concerning the domain of the log-likelihood. We address the major comment below.


Referee: [Abstract] Abstract and method description: the central claim that the decomposition of the normalized residual log-likelihood does not require non-negative variance components or positive semi-definite design matrices G_i conflicts with the requirements of the Gaussian model. For the log-likelihood (log det(V) and quadratic form) to be defined over the reals, V = sum theta_i G_i must remain positive semi-definite. If the decomposition permits regions with negative eigenvalues, the objective function, Newton minimization, and parametric bootstrap samples under the linear null become ill-posed. This is load-bearing for both the special-case test of multiple components equaling zero and the general linear-contrast tests, and requires explicit clarification or safeguards.

Authors: We agree that the Gaussian log-likelihood is defined only when V = sum theta_i G_i is positive semi-definite. The decomposition itself is an exact algebraic identity for the normalized residual log-likelihood that can be written without embedding non-negativity of the theta_i or positive semi-definiteness of the individual G_i into the formula. This is the sense in which the decomposition does not require those conditions. In all numerical work, however, the modified Newton minimization and the parametric bootstrap are restricted to the region where V is positive semi-definite; this is enforced by the line-search and by sampling only from valid null distributions. We will revise the abstract and the relevant methodological sections to state this domain restriction explicitly and to describe the safeguards that prevent evaluation at points with negative eigenvalues. The revision will cover both the general linear-contrast tests and the special case of simultaneous zero tests.

Revision: yes
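One way a line-search safeguard of the kind mentioned in the rebuttal might look is sketched below: a proposed step is halved until the resulting V passes a Cholesky test. This is an assumption about the general idea, not the authors' algorithm. It handles feasibility only; a complete line search would also require sufficient decrease of the objective. All names and the toy matrices are hypothetical.

```python
def is_positive_definite(v, tol=1e-12):
    """Cholesky-based test: True when every pivot is strictly positive."""
    n = len(v)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = v[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if pivot <= tol:
            return False
        low[j][j] = pivot ** 0.5
        for i in range(j + 1, n):
            s = v[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s / low[j][j]
    return True


def covariance(theta, mats):
    """Form V = sum_i theta_i * G_i for equally sized square matrices G_i."""
    n = len(mats[0])
    return [
        [sum(t * g[r][c] for t, g in zip(theta, mats)) for c in range(n)]
        for r in range(n)
    ]


def feasible_step(theta, direction, mats, shrink=0.5, max_halvings=30):
    """Shrink the step along `direction` until V(theta + step * direction)
    is positive definite. Returns the accepted step size and the new theta."""
    step = 1.0
    for _ in range(max_halvings):
        trial = [t + step * d for t, d in zip(theta, direction)]
        if is_positive_definite(covariance(trial, mats)):
            return step, trial
        step *= shrink
    raise ValueError("no feasible step found along this direction")


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    ones = [[1.0, 1.0], [1.0, 1.0]]
    # The full step and the half step leave the valid region; a quarter step stays in.
    print(feasible_step([2.0, 0.0], [0.0, -2.0], [identity, ones]))
```

Because the test rejects singular matrices as well as indefinite ones, the accepted iterate always has a finite log det(V), at the cost of never landing exactly on the boundary of the valid region.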

Circularity Check

0 steps flagged

No circularity: derivation relies on standard likelihood decomposition and external parametric bootstrap simulation


The paper derives a decomposition of the normalized residual log-likelihood for Gaussian variance components models, a modified Newton optimizer, and a parametric bootstrap for testing linear contrasts under the null. These steps are constructed from first-principles matrix algebra and simulation from the fitted null model rather than reducing to fitted parameters by construction or self-citation chains. The bootstrap approximates the null distribution via independent draws, not by renaming or smuggling inputs. No self-definitional equivalences, fitted-input predictions, or load-bearing self-citations appear in the central claims. The special-case test for multiple components equaling zero follows directly as an instance of the general linear-contrast procedure without circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Based solely on the abstract, the central claim rests on standard domain assumptions for Gaussian mixed models and bootstrap validity; no free parameters or invented entities are explicitly introduced in the provided text.

axioms (1)

domain assumption: Data follow a Gaussian variance components model.
The testing procedure and likelihood decomposition are developed specifically for this model, as stated in the abstract.


