pith. sign in

arxiv: 2604.06823 · v2 · pith:CSHQOOAGnew · submitted 2026-04-08 · 🧮 math.PR

On spectrum of sample correlation matrices from large fold tensor vectors

Pith reviewed 2026-05-10 18:18 UTC · model grok-4.3

classification 🧮 math.PR
keywords limiting spectral distributionMarčenko-Pastur lawsample correlation matrixtensor productWishart matrixrandom matriceshigh-dimensional statistics
0
0 comments X

The pith

Sample correlation matrices from k-fold tensor vectors converge to the Marčenko-Pastur law when k grows slower than n.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper shows that the limiting spectral distribution of sample correlation matrices, built from k-fold tensor products of n-dimensional vectors with i.i.d. entries, is the Marčenko-Pastur law. The setting requires n and k to infinity with k much smaller than n. The same Marčenko-Pastur limit holds for the Wishart matrix formed from k-fold tensor products of independent uniform unit vectors on the complex sphere. A reader cares because tensor products create structured high-dimensional data, and the result indicates that this structure does not alter the bulk eigenvalue behavior familiar from classical random matrices.

Core claim

We show that the limiting spectral distribution is the Marčenko-Pastur law for the sample correlation matrix whose sample vectors are k-fold tensor products of n-dimensional vectors with i.i.d. entries, in the regime n,k to infinity with k=o(n). As a consequence, the limiting spectral distribution of the Wishart matrix from the k-fold tensor product of independent uniformly distributed unit vectors in C^n is the Marčenko-Pastur law.

What carries the argument

The k-fold tensor product structure on the sample vectors, which under the scaling k=o(n) reduces the limiting spectrum to the standard Marčenko-Pastur law.

Load-bearing premise

The tensor fold order k must grow slower than the base dimension n, and the original vectors must have i.i.d. entries or be uniform on the unit sphere.

What would settle it

Numerical computation of the empirical spectral density for sequences with n=2000 and k=20 that visibly fails to match the Marčenko-Pastur density would contradict the claim.

read the original abstract

In this paper, we investigate the limiting spectral distribution of the sample correlation matrix, whose sample vectors are $k$-fold tensor products of $n$-dimensional vectors with i.i.d. entries. We focus on the limiting regime $n,k \to \infty$ with $k = o(n)$, and we show that the limiting spectral distribution is the Mar\v{c}enko-Pastur law. As a consequence, we show that the limiting spectral distribution of the Whishart matrix from the $k$-fold tensor product of independent uniformly distributed unit vectors in $\mathbb C^n$ is the Mar\v{c}enko-Pastur law.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper investigates the limiting spectral distribution of sample correlation matrices whose vectors are k-fold tensor products of n-dimensional i.i.d. entry vectors, in the regime n,k→∞ with k=o(n). It claims this LSD is the Marčenko-Pastur law, and as a consequence the same holds for the Wishart matrix formed from k-fold tensor products of independent uniform unit vectors in ℂ^n.

Significance. If the central claim is established with full technical control, the result would show that the Marčenko-Pastur law remains insensitive to the coordinate-wise dependencies created by tensor products when the fold depth grows slower than dimension. This extends the reach of classical RMT results to a structured setting that arises in multilinear models and tensor data analysis. The paper correctly isolates the k=o(n) window in which overlaps become rare, but the strength of the contribution hinges on whether the moment or Stieltjes-transform analysis actually closes under these dependencies.

major comments (2)
  1. [Main theorem / proof of LSD] The main result (stated in the abstract and presumably Theorem 1 or 2) asserts that the LSD is exactly the Marčenko-Pastur law. However, the tensor product structure induces non-zero covariances between coordinates whose multi-indices share at least one factor. While k=o(n) makes the probability of a single overlap O(k/n)→0, the manuscript must show that all higher-order diagram contributions (e.g., in the fourth-moment or trace-moment expansion) remain o(1) uniformly in the number of samples. No such explicit bound or vanishing argument is supplied.
  2. [Corollary on Wishart matrices] The consequence for the Wishart matrix of tensorized unit vectors likewise relies on the same limiting law. Because the base vectors are constrained to the unit sphere, additional dependence is introduced; the paper does not verify that this spherical constraint preserves the moment calculations already used for the i.i.d. case.
minor comments (2)
  1. [Abstract] The abstract contains the typographical error “Whishart” (should be “Wishart”) and inconsistent spelling of “Marčenko-Pastur”.
  2. [Title and abstract] The title uses “correlated matrices” while the abstract and body refer to “sample correlation matrix”; consistent terminology would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address the two major points below and will revise the manuscript accordingly to provide the requested explicit controls.

read point-by-point responses
  1. Referee: The main result (stated in the abstract and presumably Theorem 1 or 2) asserts that the LSD is exactly the Marčenko-Pastur law. However, the tensor product structure induces non-zero covariances between coordinates whose multi-indices share at least one factor. While k=o(n) makes the probability of a single overlap O(k/n)→0, the manuscript must show that all higher-order diagram contributions (e.g., in the fourth-moment or trace-moment expansion) remain o(1) uniformly in the number of samples. No such explicit bound or vanishing argument is supplied.

    Authors: We agree that the current write-up would be strengthened by an explicit vanishing argument for higher-order terms. In the revision we will add a dedicated subsection to the proof of the main theorem that performs a full combinatorial enumeration of the diagrams arising in the fourth-moment (and higher) expansions. Using the k=o(n) hypothesis we bound the total contribution of all overlap diagrams by a quantity that is O(k/n + (k/n)^2 + …) and therefore tends to zero uniformly in the number of samples; the argument relies only on counting the number of ways to choose shared indices and on the independence of the underlying scalar entries. revision: yes

  2. Referee: The consequence for the Wishart matrix of tensorized unit vectors likewise relies on the same limiting law. Because the base vectors are constrained to the unit sphere, additional dependence is introduced; the paper does not verify that this spherical constraint preserves the moment calculations already used for the i.i.d. case.

    Authors: We acknowledge that the spherical constraint introduces extra dependence not present in the i.i.d. setting. In the revised manuscript we will insert a short lemma immediately preceding the corollary that shows the difference between the normalized and un-normalized moment sequences is o(1). The argument uses the fact that each base vector’s Euclidean norm concentrates around its expectation (by standard concentration for sums of i.i.d. entries) together with the same overlap-counting estimates already employed for the main theorem; because k=o(n) the tensor-product structure does not amplify the normalization error. This establishes that the limiting law carries over verbatim to the unit-vector case. revision: yes

Circularity Check

0 steps flagged

No circularity: standard RMT derivation applied to tensor-structured vectors

full rationale

The paper derives the limiting spectral distribution of the sample correlation matrix from k-fold tensor products of i.i.d.-entry vectors by applying classical random matrix techniques (moment method or Stieltjes transform) under the stated regime n,k→∞ with k=o(n). The Marčenko-Pastur law emerges as the limit because the tensor-induced coordinate dependencies become negligible in the high-dimensional limit, with no steps that reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations. The derivation is self-contained against external benchmarks of random matrix theory and does not rename known results or smuggle ansatzes via prior work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Abstract-only review prevents exhaustive identification; the paper appears to rest on standard domain assumptions of random matrix theory without introducing new free parameters or invented entities.

axioms (2)
  • domain assumption Vectors have i.i.d. entries or are uniformly distributed on the unit sphere in C^n
    Stated in the abstract as the source of the sample vectors.
  • domain assumption The limiting regime n, k → ∞ with k = o(n) is sufficient for convergence to the Marchenko-Pastur law
    Central growth condition invoked to obtain the limiting distribution.

pith-pipeline@v0.9.0 · 5391 in / 1520 out tokens · 54460 ms · 2026-05-10T18:18:45.655757+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.