Better Together: Cross and Joint Covariances Enhance Signal Detectability in Undersampled Data

Arabind Swain; Ilya Nemenman; Sean Alexander Ridout

arXiv: 2507.22207 · v2 · submitted 2025-07-29 · cond-mat.dis-nn · cs.LG · physics.data-an · stat.ML

Pith reviewed 2026-05-19 03:22 UTC · model grok-4.3

classification: cond-mat.dis-nn · cs.LG · physics.data-an · stat.ML

keywords: covariance estimation · signal detection · random matrix theory · high-dimensional statistics · phase transitions · cross-correlations · joint analysis · undersampled data

The pith

Joint and cross covariance matrices detect shared signals earlier than self-covariances in high-dimensional data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper examines how to best detect a shared signal between two high-dimensional variables when data is limited and noise creates false correlations. Using random matrix theory, it compares three ways to build covariance matrices: separate self-covariances for each variable, the cross-covariance between them, and the joint covariance from concatenating the variables. The key finding is that joint and cross versions allow the signal to be reconstructed at weaker strengths than self-covariances, with the best choice depending on how different the dimensions of the two variables are. This matters for applications like analyzing paired datasets where samples are scarce.
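
As a concrete illustration of the three constructions, here is a minimal NumPy sketch. The function name `covariance_constructions`, the zero-mean assumption, and the 1/T normalization are illustrative choices for this example and are not taken from the paper.

```python
import numpy as np

def covariance_constructions(X, Y):
    """Return self, cross, and joint sample covariances.

    X: (T, NX) array, Y: (T, NY) array; rows are paired samples.
    Assumes both variables are already zero-mean (an assumption of this sketch).
    """
    T = X.shape[0]
    C_XX = X.T @ X / T        # self covariance of X, shape (NX, NX)
    C_YY = Y.T @ Y / T        # self covariance of Y, shape (NY, NY)
    C_XY = X.T @ Y / T        # cross covariance, shape (NX, NY)
    Z = np.hstack([X, Y])     # concatenated (joint) variable, shape (T, NX + NY)
    C_ZZ = Z.T @ Z / T        # joint covariance, shape (NX + NY, NX + NY)
    return C_XX, C_YY, C_XY, C_ZZ

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
Y = rng.standard_normal((50, 30))
C_XX, C_YY, C_XY, C_ZZ = covariance_constructions(X, Y)

# The joint covariance contains the self blocks on its diagonal
# and the cross block off the diagonal.
print(np.allclose(C_ZZ[:20, :20], C_XX))
print(np.allclose(C_ZZ[20:, 20:], C_YY))
print(np.allclose(C_ZZ[:20, 20:], C_XY))
```

The spectra of the symmetric matrices can then be obtained with `np.linalg.eigvalsh`, and the singular values of the rectangular cross covariance with `np.linalg.svd`.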

Core claim

In the large-dimension limit with fixed aspect ratios, the Baik-Ben Arous-Péché phase transition for the largest eigenvalue occurs at lower signal strengths for the joint covariance and cross-covariance constructions than for the individual self-covariances. Joint and cross always reconstruct the shared signal earlier, and which one is optimal depends on the mismatch in dimensionalities between the two variables.

What carries the argument

The Baik-Ben Arous-Péché detectability phase transition applied to self, cross, and joint covariance matrices constructed from two high-dimensional variables.
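
For orientation, the textbook form of this transition for a single self-covariance can be written as follows. The normalization is an assumption of this note: unit-variance noise, a single population eigenvalue $1+\theta$ along a unit vector $v$, and aspect ratio $q = N/T$. The paper's own thresholds for the cross and joint constructions, and its normalization of the signal strengths, are not reproduced here and may differ.

```latex
\lambda_{+} = \left(1+\sqrt{q}\right)^{2},
\qquad
\lambda_{\max} \;\to\;
\begin{cases}
(1+\theta)\left(1+\dfrac{q}{\theta}\right), & \theta > \sqrt{q},\\[6pt]
\left(1+\sqrt{q}\right)^{2}, & \theta \le \sqrt{q},
\end{cases}
\qquad
\left|\langle \hat{v}, v\rangle\right|^{2} \;\to\;
\begin{cases}
\dfrac{1-q/\theta^{2}}{1+q/\theta}, & \theta > \sqrt{q},\\[6pt]
0, & \theta \le \sqrt{q}.
\end{cases}
```

Below the threshold $\theta = \sqrt{q}$ the largest sample eigenvalue sticks to the bulk edge $\lambda_{+}$ and the top eigenvector carries no information about $v$; above it, an outlier detaches and the overlap becomes nonzero.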

If this is right

Applications involving detection of linear correlations between two high-dimensional measurements can achieve better signal reconstruction by using joint or cross covariances instead of separate self-covariances.
The optimal construction depends on the relative dimensions of the two variables, guiding method choice based on data structure.
These results may generalize to detecting nonlinear statistical dependencies.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

In experimental settings with limited samples, pairing measurements in joint analysis could reduce the required signal strength for detection.
Similar principles might apply to more than two variables or other matrix constructions in high-dimensional statistics.
Testing these predictions on synthetic data with controlled signal strengths and dimensions would validate the dimensionality mismatch effect.

Load-bearing premise

The background noise produces Marchenko-Pastur bulk statistics without additional structure in the large-dimension limit with fixed aspect ratios.
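
A quick numerical check of this premise might look as follows. This is a sketch under the assumption of i.i.d. unit-variance Gaussian noise with no signal; the helper name `mp_edges` and the sizes are chosen only for illustration.

```python
import numpy as np

def mp_edges(q):
    """Marchenko-Pastur bulk edges for unit-variance noise, with q = N / T."""
    return (1 - np.sqrt(q)) ** 2, (1 + np.sqrt(q)) ** 2

rng = np.random.default_rng(1)
T, N = 1000, 500                     # aspect ratio q = 0.5
G = rng.standard_normal((T, N))      # pure noise, no signal
eigs = np.linalg.eigvalsh(G.T @ G / T)   # ascending eigenvalues

lo, hi = mp_edges(N / T)
print(f"predicted bulk: [{lo:.3f}, {hi:.3f}]")
print(f"observed range: [{eigs[0]:.3f}, {eigs[-1]:.3f}]")
```

If the noise had additional structure (for example, correlations across features), the observed range would generally not match these edges, and thresholds derived from them would shift.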

What would settle it

Measuring the signal strength at which the largest eigenvalue detaches from the Marchenko-Pastur bulk in simulated or real data for self, cross, and joint covariances and checking if the transition thresholds match the predicted ordering.
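
A small simulation in this spirit can be sketched as below. The data model is an assumption of this example (a shared Gaussian latent series added along one random unit direction in each variable, on top of unit-variance noise), and the paper's normalization of a and b may differ. With this choice, the Y signal is too weak for the textbook threshold of the Y self-covariance, while the joint covariance sees a combined spike of strength a² + b².

```python
import numpy as np

def top_eigpair(C):
    """Largest eigenvalue and its unit eigenvector for a symmetric matrix."""
    w, V = np.linalg.eigh(C)         # eigenvalues in ascending order
    return w[-1], V[:, -1]

def mp_upper_edge(q):
    """Upper Marchenko-Pastur edge for unit-variance noise, q = N / T."""
    return (1 + np.sqrt(q)) ** 2

rng = np.random.default_rng(0)
T, NX, NY = 600, 600, 600            # q_X = q_Y = 1, joint q = 2
a, b = 2.0, np.sqrt(0.5)             # a^2 = 4 (strong), b^2 = 0.5 (below sqrt(q_Y) = 1)

# Assumed model: shared latent series u, unit directions vx and vy.
u = rng.standard_normal(T)
vx = rng.standard_normal(NX); vx /= np.linalg.norm(vx)
vy = rng.standard_normal(NY); vy /= np.linalg.norm(vy)
X = rng.standard_normal((T, NX)) + a * np.outer(u, vx)
Y = rng.standard_normal((T, NY)) + b * np.outer(u, vy)
Z = np.hstack([X, Y])

lam_y, vec_y = top_eigpair(Y.T @ Y / T)   # self covariance of Y
lam_z, vec_z = top_eigpair(Z.T @ Z / T)   # joint covariance

# Overlap of each estimate of the Y direction with the true vy.
self_overlap = abs(vec_y @ vy)
vz_y = vec_z[NX:] / np.linalg.norm(vec_z[NX:])   # Y block of the joint eigenvector
joint_overlap = abs(vz_y @ vy)

print(f"Y self:  top eigenvalue {lam_y:.2f}, MP edge {mp_upper_edge(NY / T):.2f}, overlap {self_overlap:.2f}")
print(f"joint:   top eigenvalue {lam_z:.2f}, MP edge {mp_upper_edge((NX + NY) / T):.2f}, overlap {joint_overlap:.2f}")
```

Sweeping `a` and `b` over a grid and repeating this for the cross covariance (via the top singular vectors of `X.T @ Y / T`) would give an empirical phase diagram to compare against the predicted ordering.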

Figures

Figures reproduced from arXiv: 2507.22207 by Arabind Swain, Ilya Nemenman, Sean Alexander Ridout.

**Figure 1.** Estimation of X and Y signals using the joint covariance. We fix $b = 0.5$, $q_X = 1$, $q_Y = 4$ ($T = 200$, $N_X = 200$, $N_Y = 800$), such that $b < b_{\mathrm{crit}}$, and then vary the X signal strength $a$. As $a$ increases, in numerical simulations, both the X (green squares) and Y (green circles) components of the estimated spike $\hat{v}_{z,\mathrm{joint}}$ develop nonzero overlap with the true spike when $a^2 + b^2$ crosses the threshold $c_{\mathrm{crit}}$ (Eq. 25…

**Figure 2.** Phase diagram for spike detectability from self and joint covariances. Solid green represents the region where a spike results in a detectable outlier in the joint covariance matrix. In the region with alternate blue and green hatching, outliers are detectable by both methods. For the white region, none of the methods are able to detect a signal. For this plot $q_X = 1$, $q_Y = 4$. The dotted lines give the bo…

**Figure 3.** Estimation of X and Y signals using the cross covariance. We fix $b = 2.5$, $q_X = 1$, $q_Y = 20$ ($T = 100$, $N_X = 100$, $N_Y = 2 \times 10^3$), such that $b < b_{\mathrm{crit}}$, and then vary the X signal strength $a$. As $a$ is increased, in numerical simulations, both $\hat{v}_{x,\mathrm{joint}}$ (orange squares) and $\hat{v}_{y,\mathrm{joint}}$ (orange circles) develop nonzero overlap with the true spike when $ab$ crosses the threshold, determined semi-analytically. Lines sho…

**Figure 4.** We consider a case where $q_X \gg q_Y$, but construct the phase diagram using the exact Eq. (30) (semianalytically). We observe that, in the undersampled regime, when either $q_X \gg 1$ or $q_Y \gg 1$, the spike is always detectable in cross covariance before it can be detected in both individual self covariances. As for the joint covariance (…

**Figure 4.** Phase diagram for spike detectability for cross and self covariances. We fix $q_X = 1$, $q_Y = 20$ (notice that the value of $q_Y$ is different from…

**Figure 5.** Comparison between joint and cross overlaps for estimating the spike in Y. We fix $b = 2.5$, $q_X = 1$, $q_Y = 20$ ($T = 100$, $N_X = 100$, $N_Y = 2 \times 10^3$) such that $b < b_{\mathrm{crit}}$, and $q_Y \gg q_X$, and then vary the X signal strength $a$. As $a$ is increased, in numerical simulations, both $\hat{v}_{y,\mathrm{cross}}$ (orange circles) and $\hat{v}_{y,\mathrm{cross}}$ (green circles) develop nonzero overlap with the true spike $\hat{v}_y$. Colored dashed lines show analytica…

**Figure 7.** Comparison between joint and cross overlaps for the latent feature model. We fix $b = 1.5$, $q_X = 1$, $q_Y = 20$ ($T = 100$, $N_X = 100$, $N_Y = 2 \times 10^3$) such that $b < b_{\mathrm{crit}}$ and $q_Y \gg q_X$, and then vary the X signal strength $a$. As $a$ is increased, in numerical simulations, both $\hat{v}_{y,\mathrm{cross}}$ (orange circles) and $\hat{v}_{y,\mathrm{cross}}$ (green circles) develop nonzero overlap with the true spike $\hat{v}_y$. As in the additive spike model (…

Original abstract

Many data-science applications involve detecting a shared signal between two high-dimensional variables. Using random matrix theory methods, we determine when such signal can be detected and reconstructed from sample correlations, despite the background of sampling noise induced correlations. We consider three different covariance matrices constructed from two high-dimensional variables: their individual self covariance, their cross covariance, and the self covariance of the concatenated (joint) variable, which incorporates the self and the cross correlation blocks. We observe the expected Baik, Ben Arous, and Péché detectability phase transition in all these covariance matrices, and we show that joint and cross covariance matrices always reconstruct the shared signal earlier than the self covariances. Whether the joint or the cross approach is better depends on the mismatch of dimensionalities between the variables. We discuss what these observations mean for choosing the right method for detecting linear correlations in data and how these findings may generalize to nonlinear statistical dependencies.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

Joint and cross covariances detect shared signals at lower strengths than self-covariances, with the better choice set by the dimension mismatch between the two variables.

The main takeaway is that this paper works out when joint or cross covariance beats self-covariance for picking up a shared signal in high-dimensional data with few samples. It extends the Baik-Ben Arous-Péché transition to the cross and joint constructions and shows that both of them cross the detectability threshold before the self-covariance does. Which of the two wins then depends on how the dimensions of the two variables line up relative to each other and to the sample size.

Referee Report

0 major / 2 minor

Summary. The manuscript applies random matrix theory to compare the detectability of a shared linear signal between two high-dimensional variables using three sample covariance constructions: the individual self-covariances, the cross-covariance, and the joint covariance of the concatenated variables. In the large-dimension limit with fixed aspect ratios, the Baik-Ben Arous-Péché phase transition for an outlier eigenvalue is shown to occur at lower signal strengths for the joint and cross constructions than for the self-covariances, with the optimal choice between joint and cross governed by the mismatch in the two variables' dimensions.

Significance. If the asymptotic ordering holds, the result supplies a concrete, RMT-based criterion for selecting among covariance-based detectors in undersampled regimes. The work explicitly invokes the Marchenko-Pastur bulk and BBP transition for all three constructions and derives a dimensionality-mismatch rule that yields falsifiable predictions, which strengthens its practical utility for data-analysis pipelines.

minor comments (2)

[§3] The finite-N corrections to the BBP threshold are noted as future work but not quantified; a short remark on the expected size of the correction for typical aspect ratios would help readers assess applicability to moderate-sized data sets.
[Abstract] The precise form of the shared signal (rank-1 spike with strength parameter) is left implicit; adding one sentence would make the central claim immediately scannable.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their positive and constructive report, which accurately captures the core contributions of the manuscript and recommends acceptance. We are encouraged by the recognition of the practical utility of the derived dimensionality-mismatch rule for selecting covariance constructions in undersampled regimes.

Referee (summary): The manuscript applies random matrix theory to compare the detectability of a shared linear signal between two high-dimensional variables using three sample covariance constructions: the individual self-covariances, the cross-covariance, and the joint covariance of the concatenated variables. In the large-dimension limit with fixed aspect ratios, the Baik-Ben Arous-Péché phase transition for an outlier eigenvalue is shown to occur at lower signal strengths for the joint and cross constructions than for the self-covariances, with the optimal choice between joint and cross governed by the mismatch in the two variables' dimensions.

Authors: We appreciate this precise summary of our results. The referee correctly identifies that the BBP transition occurs at lower signal strengths for the joint and cross constructions, with the optimal choice depending on the dimensionality mismatch, which is the central finding we wished to convey. Revision: no.

Referee (significance): If the asymptotic ordering holds, the result supplies a concrete, RMT-based criterion for selecting among covariance-based detectors in undersampled regimes. The work explicitly invokes the Marchenko-Pastur bulk and BBP transition for all three constructions and derives a dimensionality-mismatch rule that yields falsifiable predictions, which strengthens its practical utility for data-analysis pipelines.

Authors: We are pleased that the referee highlights the falsifiability and practical implications. The explicit use of the Marchenko-Pastur law and BBP transition for each construction was intended to provide a clear, testable guideline for practitioners choosing among these detectors. Revision: no.

Circularity Check

0 steps flagged

No significant circularity; derivations rest on external RMT benchmarks

The paper derives detectability thresholds for self, cross, and joint covariance matrices by applying the standard Marchenko-Pastur bulk and Baik-Ben Arous-Péché phase transition to a shared-signal model in the large-N limit with fixed aspect ratios. These RMT results are external, pre-existing mathematical facts independent of the present work and are invoked uniformly across the three constructions. The central ordering result (joint and cross detect earlier than self, with joint vs. cross depending on dimensionality mismatch) follows by direct comparison of the resulting BBP thresholds; no equation reduces to a fitted parameter renamed as prediction, no load-bearing premise rests on self-citation, and no ansatz is smuggled via prior author work. The derivation is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard random matrix theory assumptions about noise statistics and the large-N limit; no free parameters are fitted and no new entities are introduced.

axioms (2)

standard math: The noise covariance produces Marchenko-Pastur bulk statistics in the large-dimension limit.
Invoked to locate the BBP phase transition for all three covariance matrices.
domain assumption: The shared signal is a low-rank perturbation (spike) of fixed strength.
Standard spiked covariance model assumption used throughout the detectability analysis.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Optimal Spectral Algorithms for Correlated Two-view Models in High Dimensions
math.ST 2026-05 unverdicted novelty 7.0

Introduces a TAP-motivated framework and constructs explicit parameter-free spectral algorithms that achieve strong detection and weak recovery thresholds in three canonical correlated two-view models with matching lo...

Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data
physics.data-an 2026-04 unverdicted novelty 6.0

DySIB recovers a two-dimensional representation matching the phase space of a physical pendulum from high-dimensional video data by maximizing predictive mutual information in latent space.
