Regularized estimation for highly multivariate spatial Gaussian random fields

Francisco Cuevas-Pacheco; Gabriel Riffo; Xavier Emery

arxiv: 2604.07507 · v1 · submitted 2026-04-08 · 📊 stat.ME · stat.CO

Regularized estimation for highly multivariate spatial Gaussian random fields

Francisco Cuevas-Pacheco , Gabriel Riffo , Xavier Emery This is my paper

Pith reviewed 2026-05-10 17:32 UTC · model grok-4.3

classification 📊 stat.ME stat.CO

keywords LASSO regularizationMatérn covariancemultivariate spatial statisticsCholesky factorizationsparse covarianceGaussian random fieldsregularized estimationspatial prediction

0 comments

The pith

A LASSO penalty on the Cholesky factor of the multivariate Matérn correlation matrix identifies uncorrelated variable pairs in spatial Gaussian random fields while preserving positive semidefiniteness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a LASSO-penalized estimation method for the parameters of highly multivariate spatial Gaussian random fields. It applies the penalty to the Cholesky factor of the correlation matrix to induce sparsity that sets many cross-covariances to zero. This is solved using a projected block coordinate descent algorithm that maintains the positive semidefiniteness of the covariance at each step. The method is demonstrated in simulations to recover sparse structures and lower estimation error, and applied to a geochemical dataset with 36 variables where standard methods cannot be used.

Core claim

The authors propose a regularized framework for estimating the covariance parameters of a multivariate Matérn model by penalizing the Cholesky factor of the correlation matrix with a LASSO term. This induces sparsity corresponding to uncorrelated pairs of variables. Estimation proceeds through a projected block coordinate descent algorithm that decomposes the problem and enforces the positive semidefiniteness constraint via projections. The approach supports both full likelihood and composite likelihood, with discussion of regularization parameter selection.

What carries the argument

LASSO penalization applied to the Cholesky factor of the multivariate Matérn correlation matrix, optimized by projected block coordinate descent

If this is right

The method automatically identifies which variable pairs have zero cross-correlation without manual intervention.
Estimation error is reduced relative to unpenalized maximum likelihood when the true structure is sparse.
Spatial prediction becomes feasible for large numbers of variables such as p=36.
The framework works for both likelihood and composite likelihood estimation.
Positive semidefiniteness is guaranteed by the projection steps in the algorithm.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The recovered sparse patterns might correspond to meaningful conditional independences in the underlying spatial processes.
The technique could be extended to other covariance models or combined with low-rank approximations for even higher dimensions.
It may help in domains like geochemistry or environmental monitoring where many variables are measured but not all interact spatially.

Load-bearing premise

Penalizing the entries of the Cholesky factor produces a sparse structure that correctly identifies the uncorrelated variable pairs in the true spatial dependence without biasing the estimates of the remaining parameters.

What would settle it

A simulation study in which the true correlation matrix has known zero entries between certain variables, but the penalized estimator either fails to set those entries to zero or produces a matrix that is not positive semidefinite.

Figures

Figures reproduced from arXiv: 2604.07507 by Francisco Cuevas-Pacheco, Gabriel Riffo, Xavier Emery.

**Figure 2.** Figure 2: Cross-variograms of the simulated random field components. [PITH_FULL_IMAGE:figures/full_fig_p017_2.png] view at source ↗

**Figure 3.** Figure 3: Percentage of zeros with respect to λ. Left: Matrix L. Right: Matrix Ψ. 18 [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗

**Figure 4.** Figure 4: Solution paths with respect to λ. Each colored line represents an entry of the matrix. Left: Matrix L. Right: Matrix Ψ. 0 50 100 150 200 7500 7700 7900 8100 [PITH_FULL_IMAGE:figures/full_fig_p019_4.png] view at source ↗

**Figure 5.** Figure 5: AIC with respect to λ. In red, the optimal λ value. 19 [PITH_FULL_IMAGE:figures/full_fig_p019_5.png] view at source ↗

**Figure 6.** Figure 6: Cokriging predictions for the 5 random field components. [PITH_FULL_IMAGE:figures/full_fig_p020_6.png] view at source ↗

**Figure 7.** Figure 7: Cokriging variance for the third random field component. [PITH_FULL_IMAGE:figures/full_fig_p021_7.png] view at source ↗

**Figure 8.** Figure 8: Percentage of correct zero identifications in matrix [PITH_FULL_IMAGE:figures/full_fig_p022_8.png] view at source ↗

**Figure 9.** Figure 9: Boxplot of the percentage of zeros in the estimated [PITH_FULL_IMAGE:figures/full_fig_p022_9.png] view at source ↗

**Figure 10.** Figure 10: Root mean square error (RMSE) of the estimated [PITH_FULL_IMAGE:figures/full_fig_p023_10.png] view at source ↗

**Figure 11.** Figure 11: Spatial locations of the surface samples, along with rock types (Figure obtained [PITH_FULL_IMAGE:figures/full_fig_p025_11.png] view at source ↗

**Figure 12.** Figure 12: Solution paths of covariances of the variables of interest with respect to other [PITH_FULL_IMAGE:figures/full_fig_p028_12.png] view at source ↗

**Figure 13.** Figure 13: Left: Percentage of zeros in matrix L relative to λ. Center: Percentage of zeros in matrix Ψ relative to λ. Right: CLIC criterion relative to λ. The optimal λ is shown in red. 28 [PITH_FULL_IMAGE:figures/full_fig_p028_13.png] view at source ↗

**Figure 14.** Figure 14: Cokriging predictions for copper (Cu). (a) Training data. (b) Test data. (c) [PITH_FULL_IMAGE:figures/full_fig_p029_14.png] view at source ↗

**Figure 15.** Figure 15: Spatial distribution of the variables of interest after normal score transformation. [PITH_FULL_IMAGE:figures/full_fig_p050_15.png] view at source ↗

**Figure 16.** Figure 16: Empirical cross-variograms of the variables of interest with the other variables. [PITH_FULL_IMAGE:figures/full_fig_p051_16.png] view at source ↗

**Figure 17.** Figure 17: Sample variograms of the variables of interest. [PITH_FULL_IMAGE:figures/full_fig_p052_17.png] view at source ↗

read the original abstract

Estimating covariance parameters for multivariate spatial Gaussian random fields is computationally challenging, as the number of parameters grows rapidly with the number of variables, and likelihood evaluation requires operations of order $\mathcal{O}((np)^3)$. In many applications, however, not all cross-dependencies between variables are relevant, suggesting that sparse covariance structures may be both statistically advantageous and practically necessary. We propose a LASSO-penalized estimation framework that induces sparsity in the Cholesky factor of the multivariate Mat\'{e}rn correlation matrix, enabling automatic identification of uncorrelated variable pairs while preserving positive semidefiniteness. Estimation is carried out via a projected block coordinate descent algorithm that decomposes the optimization into tractable subproblems, with constraints enforced at each iteration through appropriate projections. Regularization parameter selection is discussed for both the likelihood and composite likelihood approaches. We conduct a simulation study demonstrating the ability of the method to recover sparse correlation structures and reduce estimation error relative to unpenalized approaches. We illustrate our procedure through an application to a geochemical dataset with $p = 36$ variables and $n = 3998$ spatial locations, showing the practical impact of the method and making spatial prediction feasible in a setting where standard approaches fail entirely.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Penalizing the Cholesky factor of the multivariate Matérn correlation does not deliver sparse cross-correlations or identify uncorrelated variable pairs.

read the letter

The punchline on this paper is that penalizing the Cholesky factor of the multivariate Matérn correlation matrix does not produce sparse cross-correlations, so the claim about identifying uncorrelated variable pairs does not hold up. What is new is the specific use of LASSO on the Cholesky entries combined with a projected block coordinate descent algorithm for the spatial setting. This keeps the matrix positive semidefinite through projections while allowing regularization. The paper shows this can make estimation feasible for p around 36, where the full likelihood becomes intractable. The simulation demonstrates reduced estimation error compared to unpenalized fits and some ability to recover sparse patterns. The real-data example with geochemical measurements illustrates a case where the method runs and enables prediction while standard approaches cannot. The main soft spot is the mismatch between sparsity in L and sparsity in R = L L^T. Zeroing parts of L through the penalty does not guarantee zero entries in the correlation matrix itself. This breaks the direct link to uncorrelated pairs. The stress-test concern is valid based on the abstract and the stated claim. If the full paper has additional arguments or different interpretation, that would change things, but as presented the central statistical property is not delivered. This work is for researchers in multivariate spatial statistics who need practical ways to handle many variables. A reader looking for algorithmic ideas in penalized spatial covariance estimation would get something out of the optimization details and the application. It deserves a serious referee. The computational problem is important and the method is concrete enough to review properly. I recommend sending it for peer review so the interpretation issue can be clarified or corrected in revision.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes a LASSO-penalized likelihood framework for estimating covariance parameters in highly multivariate spatial Gaussian random fields under the multivariate Matérn model. Sparsity is induced in the Cholesky factor L of the p × p correlation matrix R = L L^T via penalized estimation, with positive semidefiniteness preserved by a projected block coordinate descent algorithm. The approach is claimed to enable automatic identification of uncorrelated variable pairs. Regularization parameter selection is discussed for both full and composite likelihoods. A simulation study is presented to show recovery of sparse structures and reduced estimation error, together with a real-data application to a geochemical dataset (p = 36 variables, n = 3998 locations) where unpenalized methods are infeasible.

Significance. If the central mapping from sparsity in L to interpretable uncorrelated pairs holds without distorting the Matérn parameters or spatial dependence, the method would address a genuine computational barrier in high-dimensional spatial statistics and make likelihood inference practical for large p. The projected block coordinate descent and composite-likelihood option are pragmatic strengths. However, the interpretation gap between L-sparsity and R-sparsity (zero cross-covariances) limits the strength of the claims until addressed.

major comments (1)

[Abstract] Abstract: the central claim that the LASSO penalty on the Cholesky factor 'enables automatic identification of uncorrelated variable pairs' does not follow from the construction. Because R = L L^T, zeroing selected entries of L does not force the corresponding entries of R to zero; nonzero cross terms can remain. The simulation claim of 'recovering sparse correlation structures' therefore rests on an unverified assumption that L-sparsity produces the intended R-sparsity. This is load-bearing for the paper's interpretation and must be corrected or demonstrated explicitly (e.g., by reporting the sparsity pattern of the estimated R and verifying that cross-covariance functions are effectively zero).

minor comments (1)

The abstract states that regularization parameter selection is discussed for both likelihood and composite likelihood, but the concrete criteria (e.g., BIC, cross-validation, or information criteria) and their finite-sample behavior should be stated more explicitly.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful and constructive review of our manuscript. The major comment highlights an important distinction between sparsity in the Cholesky factor L and exact sparsity in the correlation matrix R. We address this point directly below and commit to revisions that clarify the claims and provide explicit verification.

read point-by-point responses

Referee: [Abstract] Abstract: the central claim that the LASSO penalty on the Cholesky factor 'enables automatic identification of uncorrelated variable pairs' does not follow from the construction. Because R = L L^T, zeroing selected entries of L does not force the corresponding entries of R to zero; nonzero cross terms can remain. The simulation claim of 'recovering sparse correlation structures' therefore rests on an unverified assumption that L-sparsity produces the intended R-sparsity. This is load-bearing for the paper's interpretation and must be corrected or demonstrated explicitly (e.g., by reporting the sparsity pattern of the estimated R and verifying that cross-covariance functions are effectively zero).

Authors: We agree that exact zeros in L do not mathematically guarantee exact zeros in R, since the (i,j) entry of R is the dot product of the i-th and j-th rows of L. Our approach uses the Cholesky parameterization primarily to enforce positive definiteness during penalized estimation, while the LASSO penalty on L encourages many zero entries that, in practice, drive the corresponding cross terms in R to be very small. This enables effective identification of pairs with negligible cross-dependence for the purposes of high-dimensional spatial modeling. Nevertheless, we acknowledge that the original wording in the abstract overstated the direct implication. In the revised manuscript we will (i) rephrase the abstract to state that the penalized Cholesky approach induces sparsity in L and thereby facilitates recovery of approximately sparse correlation structures, and (ii) add explicit verification in the simulation section by reporting the sparsity patterns (number of exact or near-zero entries) of both the estimated L and R matrices, together with numerical checks that the cross-covariance functions for the identified pairs fall below a small threshold (e.g., 0.01). These additions will directly address the referee's suggestion and strengthen the interpretability of the results. revision: yes

Circularity Check

0 steps flagged

No significant circularity; method is a standard penalized likelihood with independent parameterization

full rationale

The paper introduces a LASSO penalty on the Cholesky factor of the multivariate Matérn correlation matrix and uses projected block coordinate descent for estimation. No step reduces by construction to a fitted parameter renamed as a prediction, nor does any central claim rely on a self-citation chain that is itself unverified or load-bearing. The claim that sparsity in the Cholesky factor enables identification of uncorrelated pairs is a methodological assertion about the procedure's effect rather than an equivalence derived from the paper's own equations. The simulation and application sections provide empirical support outside the fitting process itself. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The method rests on standard positive-semidefiniteness of Matérn models and the modeling assumption that Cholesky sparsity corresponds to uncorrelated pairs; the main free parameter is the regularization strength.

free parameters (1)

LASSO regularization parameter
The penalty strength lambda, selected via penalized likelihood or composite likelihood as described in the abstract.

axioms (2)

standard math Multivariate Matérn correlation functions produce positive semidefinite matrices
Required for the covariance structure to remain valid after sparsification.
domain assumption Sparsity in the Cholesky factor identifies uncorrelated variable pairs
Central modeling choice enabling automatic identification of irrelevant cross-dependencies.

pith-pipeline@v0.9.0 · 5516 in / 1513 out tokens · 104274 ms · 2026-05-10T17:32:30.194045+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

SinceU N(θ) andK(θ, θ 0) (defined in Equation (17)) are compositions of continuous functions, they are themselves continuous

By conditions (C2) and (C3), the mappingθ7→Q(h;θ) is continuous with continuous derivative. SinceU N(θ) andK(θ, θ 0) (defined in Equation (17)) are compositions of continuous functions, they are themselves continuous

work page
[2]

Therefore, ˜UN converges in probability toK(θ, θ 0) (defined in (17)), which has a unique minimum atθ 0 by condition C4

The weak law of large numbers together with condition C1 implies that, asN→ ∞, bQ(h)→Q(h;θ 0) in probability. Therefore, ˜UN converges in probability toK(θ, θ 0) (defined in (17)), which has a unique minimum atθ 0 by condition C4. 43

work page
[3]

1 2 n−1X k=1 nX l=k+1 tr Q−1 kl ∂Qkl ∂θh1 Q−1 kl ∂Qkl ∂θh2 #q h1,h2=1 (29) J(θ) =

Define the modulus of continuity of ˜UN(θ) as WN(η) = sup ∥α−β∥≤η ˜UN(α)− ˜UN(β) , α, β∈Θ.(18) Then we have ˜UN(α)− ˜UN(β) ≤ X h∈M log |Q(h;α)| |Q(h;β)| + tr bQ(h) Q−1(h;α)−Q −1(h;β) .(19) By conditions (C2) and (C3), the functionsθ7→Q(h;θ) are Lipschitz; then, by composition of Lipschitz functions, we have X h∈M log |Q(h;α)| |Q(h;β)| ≤ " L1 + X h∈M L2 # ...

work page 2016

[1] [1]

SinceU N(θ) andK(θ, θ 0) (defined in Equation (17)) are compositions of continuous functions, they are themselves continuous

By conditions (C2) and (C3), the mappingθ7→Q(h;θ) is continuous with continuous derivative. SinceU N(θ) andK(θ, θ 0) (defined in Equation (17)) are compositions of continuous functions, they are themselves continuous

work page

[2] [2]

Therefore, ˜UN converges in probability toK(θ, θ 0) (defined in (17)), which has a unique minimum atθ 0 by condition C4

The weak law of large numbers together with condition C1 implies that, asN→ ∞, bQ(h)→Q(h;θ 0) in probability. Therefore, ˜UN converges in probability toK(θ, θ 0) (defined in (17)), which has a unique minimum atθ 0 by condition C4. 43

work page

[3] [3]

1 2 n−1X k=1 nX l=k+1 tr Q−1 kl ∂Qkl ∂θh1 Q−1 kl ∂Qkl ∂θh2 #q h1,h2=1 (29) J(θ) =

Define the modulus of continuity of ˜UN(θ) as WN(η) = sup ∥α−β∥≤η ˜UN(α)− ˜UN(β) , α, β∈Θ.(18) Then we have ˜UN(α)− ˜UN(β) ≤ X h∈M log |Q(h;α)| |Q(h;β)| + tr bQ(h) Q−1(h;α)−Q −1(h;β) .(19) By conditions (C2) and (C3), the functionsθ7→Q(h;θ) are Lipschitz; then, by composition of Lipschitz functions, we have X h∈M log |Q(h;α)| |Q(h;β)| ≤ " L1 + X h∈M L2 # ...

work page 2016