Introduction to the non-asymptotic analysis of random matrices

Roman Vershynin

classification math.PR cs.NA math.FA

keywords matrices, analysis, random, theoretical, applications, basic, computer, functional

This is a tutorial on some basic non-asymptotic methods and concepts in random matrix theory. The reader will learn several tools for the analysis of the extreme singular values of random matrices with independent rows or columns. Many of these methods sprang from the development of geometric functional analysis since the 1970s. They have applications in several fields, most notably in theoretical computer science, statistics and signal processing. A few basic applications are covered in this text, particularly for the problem of estimating covariance matrices in statistics and for validating probabilistic constructions of measurement matrices in compressed sensing. These notes are written particularly for graduate students and beginning researchers in different areas, including functional analysts, probabilists, theoretical statisticians, electrical engineers, and theoretical computer scientists.

Forward citations

Cited by 12 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Transformers Learn the Optimal DDPM Denoiser for Multi-Token GMMs
cs.LG 2026-04 unverdicted novelty 8.0
Transformers converge globally to the optimal DDPM denoiser for multi-token GMMs via self-attention mean denoising, with explicit token and iteration requirements.

Scaling Laws from Sequential Feature Recovery: A Solvable Hierarchical Model
stat.ML 2026-05 accept novelty 7.0
A solvable hierarchical model with power-law feature strengths yields explicit power-law scaling of prediction error through sequential recovery of latent directions by a layer-wise spectral algorithm.

Locally Near Optimal Piecewise Linear Regression in High Dimensions via Difference of Max-Affine Functions
stat.ML 2026-05 unverdicted novelty 7.0
ABGD parametrizes piecewise linear functions as difference of max-affine functions and converges linearly to an epsilon-accurate solution with O(d max(sigma/epsilon,1)^2) samples under sub-Gaussian noise, which is min...

Transformers Efficiently Perform In-Context Logistic Regression via Normalized Gradient Descent
cs.LG 2026-05 conditional novelty 7.0
Multi-layer transformers can implement in-context logistic regression by performing normalized gradient descent steps layer by layer, obtained via supervised training of a single attention layer followed by recurrent ...

Optimal Semiparametric Dynamic Pricing with Feature Diversity
stat.ME 2026-05 unverdicted novelty 7.0
A stagewise greedy algorithm for semiparametric contextual dynamic pricing achieves regret T^max(1/2, 3/(2 beta + 1)) for linear m, with a matching lower bound proving optimality.

Adaptive Estimation and Inference in Semi-parametric Heterogeneous Clustered Multitask Learning via Neyman Orthogonality
stat.ML 2026-05 unverdicted novelty 7.0
An adaptive fused orthogonal estimator recovers latent clusters exactly with high probability and achieves pooled parametric rates plus asymptotic normality matching an oracle in semiparametric heterogeneous clustered...

Linear Regression for Panel With Unknown Number of Factors as Interactive Fixed Effects
econ.EM 2026-05 unverdicted novelty 7.0
The limiting distribution of the LS estimator in panel models with interactive fixed effects is invariant to over-specifying the number of factors.

Sliced Inner Product Gromov-Wasserstein Distances
stat.ML 2026-05 unverdicted novelty 6.0
A sliced IGW distance is introduced with closed-form 1D expressions, rotational invariance, and studied structural and computational properties for efficient data alignment.

Efficient Proposal-Test-Release for Minimax Optimal Estimation
stat.ME 2026-05 unverdicted novelty 6.0
Efficient PTR replaces exact insensitive sets and Hellinger distances with simpler subsets and Lipschitz lower bounds to achieve minimax-optimal accuracy for DP Bayes classification, linear regression, and nonparametr...

Transfer Learning for Degree-Corrected Mixed Membership Network Models
stat.ME 2026-04 unverdicted novelty 6.0
Transfer learning from informative source networks improves target DCMM estimation accuracy by enlarging the eigenvalue gap of the connection probability matrix, with algorithms to avoid negative transfer.

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
cs.LG 2024-01 unverdicted novelty 6.0
SPIN lets weak LLMs become strong by self-generating training data from previous model versions and training to prefer human-annotated responses over its own outputs, outperforming DPO even with extra GPT-4 data on be...

Anchored Spectral Estimator for Rigid Motion Synchronization
math.OC 2026-04 unverdicted novelty 5.0
ASE is a spectral method for rigid motion synchronization that delivers uniform estimation error bounds and outperforms two-stage rotation-then-translation approaches on synthetic and registration tasks.