High-dimensional principal component analysis with heterogeneous missingness

Richard J. Samworth; Tengyao Wang; Ziwei Zhu

arxiv: 1906.12125 · v1 · pith:VHM32ULWnew · submitted 2019-06-28 · 📊 stat.ME · math.ST· stat.TH

High-dimensional principal component analysis with heterogeneous missingness

Ziwei Zhu , Tengyao Wang , Richard J. Samworth This is my paper

Pith reviewed 2026-05-25 13:55 UTC · model grok-4.3

classification 📊 stat.ME math.STstat.TH

keywords high-dimensional PCAmissing dataheterogeneous missingnessiterative imputationincoherence conditiongeometric convergencesingular value decomposition

0 comments

The pith

primePCA recovers principal components from high-dimensional data with heterogeneous missingness at a geometric rate under an incoherence condition in the noiseless case.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces primePCA, an iterative method for high-dimensional PCA when entries may be missing heterogeneously. It begins with an inverse-probability weighted estimator, then repeatedly projects the observed entries onto the current column-space estimate to fill in missing values and recomputes the leading right singular vectors of the completed matrix. In the noiseless setting the method is shown to drive estimation error to zero geometrically, provided the principal components satisfy an incoherence condition with respect to the missingness pattern and the signal strength is not too weak. The convergence rates depend only on average properties of the missingness mechanism rather than its most adverse realizations. This matters because standard weighted estimators perform poorly or fail to recover the components exactly under realistic heterogeneous missingness.

Core claim

In the noiseless case, under an incoherence condition on the principal components, the error of primePCA converges to zero at a geometric rate when the signal strength is not too small. The theoretical guarantees depend on average rather than worst-case properties of the missingness mechanism. The interaction between heterogeneity of missingness and low-dimensional structure determines feasibility of the problem.

What carries the argument

primePCA, an iterative procedure that imputes missing entries by projecting observed data onto the current estimate of the column space and then updates the estimate from the leading right singular vectors of the imputed matrix.

If this is right

In homogeneous missingness with constant-order noise an inverse-probability weighted estimator nearly attains the minimax rate.
primePCA succeeds at exact recovery in the noiseless heterogeneous case where the inverse-probability weighted estimator fails.
The feasibility of recovery is governed by the interaction between missingness heterogeneity and the low-rank structure rather than by worst-case missingness rates.
Numerical experiments on simulated and real data show strong performance across a wide range of heterogeneous missingness scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The incoherence condition could be checked in practice by measuring alignment between estimated components and the observed missingness pattern.
Similar iterative projection schemes may extend to matrix completion or other low-rank recovery tasks with structured missingness.
The reliance on average missingness properties suggests the method remains stable even when a small fraction of rows or columns have very high missing rates.

Load-bearing premise

The principal components are not too aligned with the missingness pattern.

What would settle it

A noiseless simulation in which the principal components are highly aligned with the missingness pattern shows that the error of primePCA fails to decrease geometrically.

Figures

Figures reproduced from arXiv: 1906.12125 by Richard J. Samworth, Tengyao Wang, Ziwei Zhu.

**Figure 1.** Figure 1: Estimates of EL(Vb prime K , VK) for various choices of σ∗ under (H1) in the noiseless setting of Section 4.1 (left) and (H2) in the noisy setting of Section 4.2 with ν = 20 (right). 4.1 Noiseless case In the noiseless setting, we set Z = 0, and also fix n = 2000, d = 500, K = 2 and Σu = 100I2. We set VK = r 1 500 1250 1250 1250 −1250 ∈ R 500×2 . 15 [PITH_FULL_IMAGE:figures/full_fig_p015_1.png] view at source ↗

**Figure 2.** Figure 2: Logarithms of the average Frobenius norm sin Θ error of [PITH_FULL_IMAGE:figures/full_fig_p017_2.png] view at source ↗

**Figure 3.** Figure 3: Leading eigenvalues of Σb y. We are now in a position to analyse the data using primePCA. For i = 1, . . . , n = 110,000 and j = 1, . . . , d = 1,777, let Yij ∈ {1, . . . , 5} denote the level of interest of user i in song j, let Kb = 10 and let I = {i : kωik1 > Kb}. Our initial goal is to assess the top Kb eigenvalues of Σy to see if there is low-rank signal in Y = (Yij ). To this end, we first apply Algo… view at source ↗

**Figure 4.** Figure 4: Plots of the first two principal components [PITH_FULL_IMAGE:figures/full_fig_p021_4.png] view at source ↗

read the original abstract

We study the problem of high-dimensional Principal Component Analysis (PCA) with missing observations. In simple, homogeneous missingness settings with a noise level of constant order, we show that an existing inverse-probability weighted (IPW) estimator of the leading principal components can (nearly) attain the minimax optimal rate of convergence. However, deeper investigation reveals both that, particularly in more realistic settings where the missingness mechanism is heterogeneous, the empirical performance of the IPW estimator can be unsatisfactory, and moreover that, in the noiseless case, it fails to provide exact recovery of the principal components. Our main contribution, then, is to introduce a new method for high-dimensional PCA, called `primePCA', that is designed to cope with situations where observations may be missing in a heterogeneous manner. Starting from the IPW estimator, primePCA iteratively projects the observed entries of the data matrix onto the column space of our current estimate to impute the missing entries, and then updates our estimate by computing the leading right singular space of the imputed data matrix. It turns out that the interaction between the heterogeneity of missingness and the low-dimensional structure is crucial in determining the feasibility of the problem. We therefore introduce an incoherence condition on the principal components and prove that in the noiseless case, the error of primePCA converges to zero at a geometric rate when the signal strength is not too small. An important feature of our theoretical guarantees is that they depend on average, as opposed to worst-case, properties of the missingness mechanism. Our numerical studies on both simulated and real data reveal that primePCA exhibits very encouraging performance across a wide range of scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

primePCA gives a practical iterative refinement of IPW for heterogeneous missingness in high-dim PCA, with geometric convergence to exact recovery in the noiseless case under incoherence and average missingness properties.

read the letter

primePCA is the main new contribution here. It starts with the inverse-probability weighted estimator and then iteratively imputes missing entries by projecting the observed data onto the column space of the current estimate, followed by updating via the leading singular vectors of the completed matrix. The paper proves that in the noiseless case, under an incoherence condition on the principal components and with sufficient signal strength, the error of this procedure converges geometrically to zero. A notable feature is that the theoretical guarantees rely on average properties of the missingness mechanism rather than worst-case ones.

Referee Report

0 major / 2 minor

Summary. The paper studies high-dimensional PCA under heterogeneous missing observations. It shows that an existing IPW estimator nearly attains the minimax rate in homogeneous missingness with constant noise but performs poorly under heterogeneity and fails to achieve exact recovery in the noiseless case. The main contribution is primePCA, which initializes with the IPW estimator and iteratively imputes missing entries by projecting observed data onto the column space of the current estimate before updating via the leading right singular vectors of the completed matrix. Under an incoherence condition on the principal components and a lower bound on signal strength, the method is proved to converge geometrically to exact recovery in the noiseless case; the rates depend on average (not worst-case) properties of the missingness mechanism. Numerical experiments on simulated and real data demonstrate strong empirical performance across scenarios.

Significance. If the stated conditions hold, the work is significant for addressing realistic heterogeneous missingness in high-dimensional PCA by exploiting the interaction between low-rank structure and missingness patterns. The geometric convergence result in the noiseless case and the reliance on average missingness properties (rather than worst-case) are notable theoretical strengths. The iterative refinement of the IPW initializer is a clear methodological advance, and the empirical results across a range of settings provide supporting evidence. The paper explicitly builds on the existing IPW estimator without circularity in the rates.

minor comments (2)

[§3] §3 (method description): the update step in Algorithm 1 could explicitly state how the singular vectors are normalized or scaled to ensure the iteration is well-defined when the imputed matrix has rank deficiency.
[numerical studies] The numerical section would benefit from reporting the exact values of the incoherence parameter and signal strength used in the simulations to allow direct comparison with the theoretical thresholds.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive and accurate summary of our work, as well as for the recommendation of minor revision. We appreciate the recognition of the theoretical and methodological contributions of primePCA.

Circularity Check

0 steps flagged

No significant circularity; derivation self-contained under stated assumptions

full rationale

The paper starts from an existing IPW estimator (cited as prior work) and defines primePCA as an iterative refinement whose geometric convergence in the noiseless case is proved under an explicit incoherence condition on the principal components plus a signal-strength lower bound. These are external assumptions, not derived from the algorithm itself. The rates are shown to depend on average missingness properties rather than worst-case, but this is a proved property of the iteration, not a tautology or self-citation reduction. No step equates a claimed prediction or rate to a fitted quantity or prior self-result by construction. The central claim therefore retains independent mathematical content.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on an incoherence condition (treated as a domain assumption) and the noiseless setting; no free parameters or invented entities are described in the abstract.

axioms (1)

domain assumption Incoherence condition on the principal components (they are not too aligned with the missingness pattern)
Required for the geometric convergence statement in the noiseless case

pith-pipeline@v0.9.0 · 5833 in / 1418 out tokens · 19126 ms · 2026-05-25T13:55:44.913345+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

49 extracted references · 49 canonical work pages · 2 internal anchors

[1]

Anderson, T. W. (1957) Maximum likelihood estimates for a multivariate normal distribution when some observations are missing. J. Amer. Statist. Assoc., 52, 200--203

work page 1957
[2]

and Tsybakov, A

Belloni, A., Rosenbaum, M. and Tsybakov, A. B. (2017) Linear and conic programming estimators in high dimensional errors-in-variables models. J. Roy. Statist. Soc., Ser. B, 79, 939--956

work page 2017
[3]

and Massart, P

Boucheron, S., Lugosi, G. and Massart, P. (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford

work page 2013
[4]

Cai, T. T. and Zhang, A. (2018a) Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics. Ann. Statist., 46, 60--89

work page
[5]

Cai, T. T. and Zhang, L. (2018b) High-dimensional linear discriminant analysis: optimality, adaptive algorithm, and missing data. arXiv:1804.03018

work page internal anchor Pith review Pith/arXiv arXiv
[6]

J., Li, X., Ma, Y

Cand\`es, E. J., Li, X., Ma, Y. and Wright, J. (2011) Robust principal component analysis? J. ACM, 58, 11:1--11:37

work page 2011
[7]

Cand\`es, E. J. and Plan, Y. (2010) Matrix completion with noise. Proc. IEEE, 98, 925--936

work page 2010
[8]

Cand\`es, E. J. and Recht, B. (2009) Exact matrix completion via convex optimization. Found. Comput. Math., 9, 717--772

work page 2009
[9]

and Priebe, C

Cape, J., Tang, M. and Priebe, C. E. (2018) The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics. Ann. Statist., to appear

work page 2018
[10]

and Chen Y

Chi Y., Lu Y. and Chen Y. (2018) Nonconvex optimization meets low-rank matrix factorization: An overview. arXiv preprint arXiv:1809.09573

work page arXiv 2018
[11]

and Rohe, K

Cho, J., Kim, D. and Rohe, K. (2017) Asymptotic theory for estimating the singular vectors and values of a partially-observed low rank matrix with noise. Statist. Sinica, 27, 1921--1948

work page 2017
[12]

P., Laird, N

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc., Ser. B, 39, 1--38

work page 1977
[13]

and Josse J

Dray, S. and Josse J. (2015) Principal component analysis with missing values: a comparative survey of methods. Plant Ecol., 216, 657--667

work page 2015
[14]

Sparse spectral estimation with missing and corrupted measurements

Elsener, A and van de Geer, S. (2018) Sparse spectral estimation with missing and corrupted measurements. arXiv:1811.10443

work page internal anchor Pith review Pith/arXiv arXiv 2018
[15]

Large covariance estimation by thresholding principal orthogonal complements. J. Roy. Statist. Soc., Ser. B, 75, 603--680

work page
[16]

Ford, B. L. (1983) An overview of hot-deck procedures. In W. G. Madow, I. Olkin and D. B. Rubin (Eds.) Incomplete Data in Sample Surveys, Vol. 2: Theory and Bibliographies, 185--207. Academic Press, New York

work page 1983
[17]

Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2016) Achieving optimal misclassification proportion in stochastic block models. J. Mach. Learn. Res., 18, 1--45

work page 2016
[18]

Hastie T., Mazumder R., Lee J. D. and Zadeh R. (2015) Matrix completion and low-rank SVD via fast alternating least squares. J. Mach. Learn. Res., 16, 3367--3402

work page 2015
[19]

and Sharakhmetov, S

Ibragimov, R. and Sharakhmetov, S. (1998) On an exact constant for the Rosenthal inequality. Theory Probab. Appl., 42, 294--302

work page 1998
[20]

Johnstone, I. M. and Lu, A. Y. (2009) On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc., 104, 682--693

work page 2009
[21]

and Husson F

Josse J. and Husson F. (2012) Handling missing values in exploratory multivariate data analysis methods. J. de la Soci\'et\'e Fran caise de Statistique, 153, 1--21

work page 2012
[22]

and Husson F

Josse J., Pag\`es J. and Husson F. (2009) Gestion des donne\'es manquantes en analyse en composantes principales. J. de la Soci\'et\'e Fran caise de Statistique, 150, 28--51

work page 2009
[23]

H., Montanari, A

Keshavan, R. H., Montanari, A. and Oh, S. (2010) Matrix completion from a few entries. IEEE Trans. Inform. Theory, 56, 2980--2998

work page 2010
[24]

Kiers H. A. L. (1997) Weighted least squares fitting using ordinary least squares algorithms. Psychometrika, 62, 251--266

work page 1997
[25]

and Tsybakov, A

Koltchinskii, V., Lounici, K. and Tsybakov, A. B. (2011) Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist., 39, 2302--2329

work page 2011
[26]

Little, R. J. and Rubin, D. B. (2014) Statistical Analysis with Missing Data, John Wiley & Sons, Hoboken

work page 2014
[27]

and Tan, X

Loh, P.-L. and Tan, X. L. (2018) High-dimensional robust precision matrix estimation: Cellwise corruption under -contamination. Electron. J. Statist., 12, 1429--1467

work page 2018
[28]

and Wainwright, M

Loh, P.-L. and Wainwright, M. J. (2012) High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity. Ann. Statist., 40, 1637--1664

work page 2012
[29]

(2013) Sparse principal component analysis with missing observations

Lounici, K. (2013) Sparse principal component analysis with missing observations. In C. Houdr\'e et al. (Eds.) High Dimensional Probability VI, 327--356. Birkh\"auser, Basel

work page 2013
[30]

(2014) High-dimensional covariance matrix estimation with missing observations

Lounici, K. (2014) High-dimensional covariance matrix estimation with missing observations. Bernoulli, 20, 1029--1058

work page 2014
[31]

(2007) Concentration Inequalities and Model Selection, Springer, Berlin

Massart, P. (2007) Concentration Inequalities and Model Selection, Springer, Berlin

work page 2007
[32]

and Tibshirani R

Mazumder, R., Hastie, T. and Tibshirani R. (2010) Spectral regularization algorithms for learning large incomplete matrices. J. Mach. Learn. Res., 11, 2287--2322

work page 2010
[33]

and Wainwright, M

Negahban, S. and Wainwright, M. J. (2012) Restricted strong convexity and weighted matrix completion: optimal bounds with noise. J. Mach. Learn. Res., 13, 1665--1697

work page 2012
[34]

and Yu, B

Rohe, K., Chatterjee, S. and Yu, B. (2011) Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist., 39, 1878--1915

work page 2011
[35]

and Tsybakov, A

Rosenbaum, M. and Tsybakov, A. B. (2010) Sparse recovery under matrix uncertainty. Ann. Statist., 38, 2620--2651

work page 2010
[36]

(1970) On the subspaces of L_p ( p> 2 ) spanned by sequences of independent random variables

Rosenthal, H.P. (1970) On the subspaces of L_p ( p> 2 ) spanned by sequences of independent random variables. Israel Journal of Mathematics, 8(3), 273--303

work page 1970
[37]

Rubin, D. B. (2004) Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons, Hoboken

work page 2004
[38]

and Marron, J

Shen, D., Shen, H., Zhu, H. and Marron, J. (2016) The statistics and mathematics of high dimension low sample size asymptotics. Statist. Sinica, 26, 1747--1770

work page 2016
[39]

Tropp, J. A. (2012) User-friendly tail bounds for sums of random matrices. Found. Comput. Math., 12, 389--434

work page 2012
[40]

van der Vaart, A. W. and Wellner, J. A. (1996) Weak Convergence and Empirical Processes. Springer, New York

work page 1996
[41]

(2012) Introduction to the non-asymptotic analysis of random matrices

Vershynin, R. (2012) Introduction to the non-asymptotic analysis of random matrices. In Y. Eldar and G. Kutyniok (Eds.) Compressed Sensing, Theory and Applications. Cambridge University Press, Cambridge. 210--268

work page 2012
[42]

and Plan, Y

Wang, T., Berthet, Q. and Plan, Y. (2016) Average-case hardness of RIP certification. Adv. Neural. Inf. Process. Syst., 29

work page 2016
[43]

Wainwright, M. J. (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Cambridge

work page 2019
[44]

and Fan, J

Wang, W. and Fan, J. (2017) Asymptotics of empirical eigen-structure for ultra-high dimensional spiked covariance model. Ann. Statist., 45, 1342--1374

work page 2017
[45]

and Lyttkens, E

Wold, H. and Lyttkens, E. (1969) Nonlinear iterative partial least squares (NIPALS) estimation procedures. Bull. Int. Stat. Inst., 43, 29--51

work page 1969
[46]

(1997) Assouad, Fano and Le Cam

Yu, B. (1997) Assouad, Fano and Le Cam. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, Pollard, D., Torgersen, E. and Yang G. L. (Eds.), 423--435. Springer, New York

work page 1997
[47]

and Samworth, R

Yu, Y., Wang, T. and Samworth, R. J. (2015) A useful variant of the Davis--Kahan theorem for statisticians. Biometrika, 102, 315--323

work page 2015
[48]

Zhang, A., Cai, T. T. and Wu, Y. (2018) Heteroskedastic PCA: Algorithm, optimality, and applications. arXiv:1810.08316

work page arXiv 2018
[49]

and Samworth, R

Zhu, Z., Wang, T. and Samworth, R. J. (2019) primePCA : projected refinement for imputation of missing entries in principal component analysis. R package version 1.0. https://cran.r-project.org/web/packages/primePCA/

work page 2019

[1] [1]

Anderson, T. W. (1957) Maximum likelihood estimates for a multivariate normal distribution when some observations are missing. J. Amer. Statist. Assoc., 52, 200--203

work page 1957

[2] [2]

and Tsybakov, A

Belloni, A., Rosenbaum, M. and Tsybakov, A. B. (2017) Linear and conic programming estimators in high dimensional errors-in-variables models. J. Roy. Statist. Soc., Ser. B, 79, 939--956

work page 2017

[3] [3]

and Massart, P

Boucheron, S., Lugosi, G. and Massart, P. (2013) Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Oxford

work page 2013

[4] [4]

Cai, T. T. and Zhang, A. (2018a) Rate-optimal perturbation bounds for singular subspaces with applications to high-dimensional statistics. Ann. Statist., 46, 60--89

work page

[5] [5]

Cai, T. T. and Zhang, L. (2018b) High-dimensional linear discriminant analysis: optimality, adaptive algorithm, and missing data. arXiv:1804.03018

work page internal anchor Pith review Pith/arXiv arXiv

[6] [6]

J., Li, X., Ma, Y

Cand\`es, E. J., Li, X., Ma, Y. and Wright, J. (2011) Robust principal component analysis? J. ACM, 58, 11:1--11:37

work page 2011

[7] [7]

Cand\`es, E. J. and Plan, Y. (2010) Matrix completion with noise. Proc. IEEE, 98, 925--936

work page 2010

[8] [8]

Cand\`es, E. J. and Recht, B. (2009) Exact matrix completion via convex optimization. Found. Comput. Math., 9, 717--772

work page 2009

[9] [9]

and Priebe, C

Cape, J., Tang, M. and Priebe, C. E. (2018) The two-to-infinity norm and singular subspace geometry with applications to high-dimensional statistics. Ann. Statist., to appear

work page 2018

[10] [10]

and Chen Y

Chi Y., Lu Y. and Chen Y. (2018) Nonconvex optimization meets low-rank matrix factorization: An overview. arXiv preprint arXiv:1809.09573

work page arXiv 2018

[11] [11]

and Rohe, K

Cho, J., Kim, D. and Rohe, K. (2017) Asymptotic theory for estimating the singular vectors and values of a partially-observed low rank matrix with noise. Statist. Sinica, 27, 1921--1948

work page 2017

[12] [12]

P., Laird, N

Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977) Maximum likelihood from incomplete data via the EM algorithm. J. Roy. Statist. Soc., Ser. B, 39, 1--38

work page 1977

[13] [13]

and Josse J

Dray, S. and Josse J. (2015) Principal component analysis with missing values: a comparative survey of methods. Plant Ecol., 216, 657--667

work page 2015

[14] [14]

Sparse spectral estimation with missing and corrupted measurements

Elsener, A and van de Geer, S. (2018) Sparse spectral estimation with missing and corrupted measurements. arXiv:1811.10443

work page internal anchor Pith review Pith/arXiv arXiv 2018

[15] [15]

Large covariance estimation by thresholding principal orthogonal complements. J. Roy. Statist. Soc., Ser. B, 75, 603--680

work page

[16] [16]

Ford, B. L. (1983) An overview of hot-deck procedures. In W. G. Madow, I. Olkin and D. B. Rubin (Eds.) Incomplete Data in Sample Surveys, Vol. 2: Theory and Bibliographies, 185--207. Academic Press, New York

work page 1983

[17] [17]

Gao, C., Ma, Z., Zhang, A. Y. and Zhou, H. H. (2016) Achieving optimal misclassification proportion in stochastic block models. J. Mach. Learn. Res., 18, 1--45

work page 2016

[18] [18]

Hastie T., Mazumder R., Lee J. D. and Zadeh R. (2015) Matrix completion and low-rank SVD via fast alternating least squares. J. Mach. Learn. Res., 16, 3367--3402

work page 2015

[19] [19]

and Sharakhmetov, S

Ibragimov, R. and Sharakhmetov, S. (1998) On an exact constant for the Rosenthal inequality. Theory Probab. Appl., 42, 294--302

work page 1998

[20] [20]

Johnstone, I. M. and Lu, A. Y. (2009) On consistency and sparsity for principal components analysis in high dimensions. J. Amer. Statist. Assoc., 104, 682--693

work page 2009

[21] [21]

and Husson F

Josse J. and Husson F. (2012) Handling missing values in exploratory multivariate data analysis methods. J. de la Soci\'et\'e Fran caise de Statistique, 153, 1--21

work page 2012

[22] [22]

and Husson F

Josse J., Pag\`es J. and Husson F. (2009) Gestion des donne\'es manquantes en analyse en composantes principales. J. de la Soci\'et\'e Fran caise de Statistique, 150, 28--51

work page 2009

[23] [23]

H., Montanari, A

Keshavan, R. H., Montanari, A. and Oh, S. (2010) Matrix completion from a few entries. IEEE Trans. Inform. Theory, 56, 2980--2998

work page 2010

[24] [24]

Kiers H. A. L. (1997) Weighted least squares fitting using ordinary least squares algorithms. Psychometrika, 62, 251--266

work page 1997

[25] [25]

and Tsybakov, A

Koltchinskii, V., Lounici, K. and Tsybakov, A. B. (2011) Nuclear-norm penalization and optimal rates for noisy low-rank matrix completion. Ann. Statist., 39, 2302--2329

work page 2011

[26] [26]

Little, R. J. and Rubin, D. B. (2014) Statistical Analysis with Missing Data, John Wiley & Sons, Hoboken

work page 2014

[27] [27]

and Tan, X

Loh, P.-L. and Tan, X. L. (2018) High-dimensional robust precision matrix estimation: Cellwise corruption under -contamination. Electron. J. Statist., 12, 1429--1467

work page 2018

[28] [28]

and Wainwright, M

Loh, P.-L. and Wainwright, M. J. (2012) High-dimensional regression with noisy and missing data: provable guarantees with nonconvexity. Ann. Statist., 40, 1637--1664

work page 2012

[29] [29]

(2013) Sparse principal component analysis with missing observations

Lounici, K. (2013) Sparse principal component analysis with missing observations. In C. Houdr\'e et al. (Eds.) High Dimensional Probability VI, 327--356. Birkh\"auser, Basel

work page 2013

[30] [30]

(2014) High-dimensional covariance matrix estimation with missing observations

Lounici, K. (2014) High-dimensional covariance matrix estimation with missing observations. Bernoulli, 20, 1029--1058

work page 2014

[31] [31]

(2007) Concentration Inequalities and Model Selection, Springer, Berlin

Massart, P. (2007) Concentration Inequalities and Model Selection, Springer, Berlin

work page 2007

[32] [32]

and Tibshirani R

Mazumder, R., Hastie, T. and Tibshirani R. (2010) Spectral regularization algorithms for learning large incomplete matrices. J. Mach. Learn. Res., 11, 2287--2322

work page 2010

[33] [33]

and Wainwright, M

Negahban, S. and Wainwright, M. J. (2012) Restricted strong convexity and weighted matrix completion: optimal bounds with noise. J. Mach. Learn. Res., 13, 1665--1697

work page 2012

[34] [34]

and Yu, B

Rohe, K., Chatterjee, S. and Yu, B. (2011) Spectral clustering and the high-dimensional stochastic blockmodel. Ann. Statist., 39, 1878--1915

work page 2011

[35] [35]

and Tsybakov, A

Rosenbaum, M. and Tsybakov, A. B. (2010) Sparse recovery under matrix uncertainty. Ann. Statist., 38, 2620--2651

work page 2010

[36] [36]

(1970) On the subspaces of L_p ( p> 2 ) spanned by sequences of independent random variables

Rosenthal, H.P. (1970) On the subspaces of L_p ( p> 2 ) spanned by sequences of independent random variables. Israel Journal of Mathematics, 8(3), 273--303

work page 1970

[37] [37]

Rubin, D. B. (2004) Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons, Hoboken

work page 2004

[38] [38]

and Marron, J

Shen, D., Shen, H., Zhu, H. and Marron, J. (2016) The statistics and mathematics of high dimension low sample size asymptotics. Statist. Sinica, 26, 1747--1770

work page 2016

[39] [39]

Tropp, J. A. (2012) User-friendly tail bounds for sums of random matrices. Found. Comput. Math., 12, 389--434

work page 2012

[40] [40]

van der Vaart, A. W. and Wellner, J. A. (1996) Weak Convergence and Empirical Processes. Springer, New York

work page 1996

[41] [41]

(2012) Introduction to the non-asymptotic analysis of random matrices

Vershynin, R. (2012) Introduction to the non-asymptotic analysis of random matrices. In Y. Eldar and G. Kutyniok (Eds.) Compressed Sensing, Theory and Applications. Cambridge University Press, Cambridge. 210--268

work page 2012

[42] [42]

and Plan, Y

Wang, T., Berthet, Q. and Plan, Y. (2016) Average-case hardness of RIP certification. Adv. Neural. Inf. Process. Syst., 29

work page 2016

[43] [43]

Wainwright, M. J. (2019) High-Dimensional Statistics: A Non-Asymptotic Viewpoint. Cambridge University Press, Cambridge

work page 2019

[44] [44]

and Fan, J

Wang, W. and Fan, J. (2017) Asymptotics of empirical eigen-structure for ultra-high dimensional spiked covariance model. Ann. Statist., 45, 1342--1374

work page 2017

[45] [45]

and Lyttkens, E

Wold, H. and Lyttkens, E. (1969) Nonlinear iterative partial least squares (NIPALS) estimation procedures. Bull. Int. Stat. Inst., 43, 29--51

work page 1969

[46] [46]

(1997) Assouad, Fano and Le Cam

Yu, B. (1997) Assouad, Fano and Le Cam. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, Pollard, D., Torgersen, E. and Yang G. L. (Eds.), 423--435. Springer, New York

work page 1997

[47] [47]

and Samworth, R

Yu, Y., Wang, T. and Samworth, R. J. (2015) A useful variant of the Davis--Kahan theorem for statisticians. Biometrika, 102, 315--323

work page 2015

[48] [48]

Zhang, A., Cai, T. T. and Wu, Y. (2018) Heteroskedastic PCA: Algorithm, optimality, and applications. arXiv:1810.08316

work page arXiv 2018

[49] [49]

and Samworth, R

Zhu, Z., Wang, T. and Samworth, R. J. (2019) primePCA : projected refinement for imputation of missing entries in principal component analysis. R package version 1.0. https://cran.r-project.org/web/packages/primePCA/

work page 2019