pith. machine review for the scientific record.

arxiv: 1609.05573 · v2 · submitted 2016-09-19 · math.ST · cs.DS · cs.IT · math.IT · math.PR · stat.ML · stat.TH

Optimality and Sub-optimality of PCA for Spiked Random Matrices and Synchronization

keywords: threshold, ensemble, matrix, random, achieves, analysis, priors, spike

A central problem of random matrix theory is to understand the eigenvalues of spiked random matrix models, in which a prominent eigenvector is planted into a random matrix. These distributions form natural statistical models for principal component analysis (PCA) problems throughout the sciences. Baik, Ben Arous and Péché showed that the spiked Wishart ensemble exhibits a sharp phase transition asymptotically: when the signal strength is above a critical threshold, it is possible to detect the presence of a spike based on the top eigenvalue, and below the threshold the top eigenvalue provides no information. Such results form the basis of our understanding of when PCA can detect a low-rank signal in the presence of noise. However, not all the information about the spike is necessarily contained in the spectrum. We study the fundamental limitations of statistical methods, including non-spectral ones. Our results include: I) For the Gaussian Wigner ensemble, we show that PCA achieves the optimal detection threshold for a variety of benign priors for the spike. We extend previous work on the spherically symmetric and i.i.d. Rademacher priors through an elementary, unified analysis. II) For any non-Gaussian Wigner ensemble, we show that PCA is always suboptimal for detection. However, a variant of PCA achieves the optimal threshold (for benign priors) by pre-transforming the matrix entries according to a carefully designed function. This approach has been stated before, and we give a rigorous and general analysis. III) For both the Gaussian Wishart ensemble and various synchronization problems over groups, we show that inefficient procedures can work below the threshold where PCA succeeds, whereas no known efficient algorithm achieves this. This conjectural gap between what is statistically possible and what can be done efficiently remains open.
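
The spectral phase transition described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' code, for the Gaussian Wigner case with a Rademacher spike. It assumes the common normalization Y = lam * x x^T + W / sqrt(n) with a unit-norm spike x and GOE noise W, under which the PCA threshold sits at lam = 1 and the top eigenvalue tends to lam + 1/lam above the threshold and to 2 below it. The normalization, the limit formula, and all function names are assumptions introduced here for illustration; the abstract does not state them.

```python
import numpy as np


def spiked_wigner(n, lam, rng):
    """Sample Y = lam * x x^T + W / sqrt(n) (assumed normalization)."""
    # Spike with i.i.d. +/-1 entries, rescaled to unit norm (Rademacher prior).
    x = rng.choice([-1.0, 1.0], size=n) / np.sqrt(n)
    # GOE noise: symmetric, off-diagonal variance 1, diagonal variance 2.
    g = rng.standard_normal((n, n))
    w = (g + g.T) / np.sqrt(2.0)
    return lam * np.outer(x, x) + w / np.sqrt(n), x


def top_eigenpair(y):
    """Return the largest eigenvalue of symmetric y and its unit eigenvector."""
    vals, vecs = np.linalg.eigh(y)  # eigenvalues in ascending order
    return vals[-1], vecs[:, -1]


def predicted_top_eigenvalue(lam):
    """Large-n limit of the top eigenvalue under the assumed normalization."""
    return lam + 1.0 / lam if lam > 1.0 else 2.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for lam in (0.5, 1.5, 2.0):
        y, x = spiked_wigner(800, lam, rng)
        val, vec = top_eigenpair(y)
        overlap = float(vec @ x) ** 2  # squared correlation with the spike
        print(f"lam={lam}: top eigenvalue {val:.3f}, "
              f"predicted {predicted_top_eigenvalue(lam):.3f}, "
              f"squared overlap {overlap:.3f}")
```

Below lam = 1 the top eigenvalue stays near the bulk edge and the top eigenvector is nearly uncorrelated with the spike, which is the sense in which the spectrum carries no information there. The non-spectral tests studied in the paper are not covered by this sketch.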

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Mixing times of Langevin dynamics for spiked matrix models

    math.PR · 2026-04 · unverdicted · novelty 6.0

    For spiked Wigner matrices, Langevin dynamics mixes in O(log N) time from uniform or top-eigenvector-symmetric starts below the critical inverse temperature 1/θ, while worst-case mixing is exponential in N with rate e...

  2. Algorithmic Contiguity from Low-Degree Heuristic II: Predicting Detection-Recovery Gaps

    math.ST · 2026-04 · unverdicted · novelty 6.0

    A model-independent framework converts mild low-degree testing advantages into conditional computational lower bounds for recovery tasks, recovering prior results for planted submatrix and SBM while providing new evid...