A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

Alberto Del Pia; Dekun Zhou

arxiv: 2507.09148 · v2 · pith:ROHAIZOPnew · submitted 2025-07-12 · 📊 stat.ML · cs.LG· math.OC

A Randomized Algorithm for Sparse PCA based on the Basic SDP Relaxation

Alberto Del Pia , Dekun Zhou This is my paper

Pith reviewed 2026-05-22 00:00 UTC · model grok-4.3

classification 📊 stat.ML cs.LGmath.OC

keywords sparse PCASDP relaxationrandomized algorithmapproximation ratiodimensionality reductionhigh-dimensional statisticsmachine learning optimization

0 comments

The pith

A randomized algorithm for sparse PCA using SDP relaxation achieves an approximation ratio bounded by the sparsity constant with high probability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a randomized approximation algorithm for Sparse Principal Component Analysis, a key method for reducing dimensions in data with many features that is computationally hard. The approach takes an approximate solution from the basic semidefinite programming relaxation, creates one deterministic sparse vector and several randomized ones through rounding, and selects the best result. It proves that the approximation ratio is at most the sparsity constant with high probability when the algorithm is invoked sufficiently often. Additionally, under a technical assumption that holds when the SDP solution is low-rank or has eigenvalues that decay exponentially, the average approximation ratio is bounded by O(log d) where d is the number of features. This provides a practical way to solve an otherwise intractable problem in high-dimensional statistics and machine learning.

Core claim

The algorithm constructs one deterministic sparse solution and several randomized solutions from an approximate SDP solution for SPCA and outputs the best among them. This guarantees an approximation ratio of at most the sparsity constant with high probability if called enough times. Under the technical assumption satisfied for low-rank SDP solutions or those with exponentially decaying eigenvalues, the expected approximation ratio is O(log d). The paper also shows the assumption holds for two classes of instances and that in a generalized covariance model the deterministic solution is near-optimal.

What carries the argument

The combination of a deterministic construction and randomized rounding applied to an approximate solution of the basic SDP relaxation, followed by selecting the best among the generated sparse vectors.

If this is right

Running the algorithm multiple times yields a sparse vector whose objective value is within the sparsity constant of the optimum with high probability.
In covariance models that generalize the spiked Wishart model, the deterministic solution alone achieves a near-optimal approximation ratio.
The guarantees apply to any SDP solution that is low-rank or has exponentially decaying eigenvalues.
Numerical tests on real-world datasets confirm that the algorithm produces competitive sparse principal components in practice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same rounding strategy could be tested on SDP relaxations for related problems like sparse regression or clustering to see if similar ratio bounds emerge.
Relaxing the eigenvalue decay condition to polynomial decay might extend the O(log d) average bound to a wider class of SDP solutions.
Practical implementations could combine this method with warm-starting from simpler heuristics for even faster runtime on large datasets.
The near-optimality result in the generalized covariance model suggests the approach may be particularly strong when the underlying data has a clear low-rank signal plus noise.

Load-bearing premise

The SDP solution must satisfy a technical condition such as being low-rank or having exponentially decaying eigenvalues for the average approximation ratio to be bounded by O(log d).

What would settle it

An SDP solution for an SPCA instance with slowly decaying eigenvalues where repeated randomized rounding yields average approximation ratios worse than O(log d) would disprove the average performance bound.

Figures

Figures reproduced from arXiv: 2507.09148 by Alberto Del Pia, Dekun Zhou.

**Figure 1.** Figure 1: Chan-normalized gaps (higher = better). Each panel fixes the sparsity level k and plots the Chan-normalized gaps of the six algorithms across the seven benchmark datasets, ordered from the smallest to the largest dimension (left to right). Our Randomized Algorithm achieves the best objective on 31 of the 41 instances (75%) and never falls below Chan by more than 1.5%. Note that we do not report the result … view at source ↗

**Figure 2.** Figure 2: Wall-clock runtimes (log scale). The same experiments as [PITH_FULL_IMAGE:figures/full_fig_p014_2.png] view at source ↗

read the original abstract

Sparse Principal Component Analysis (SPCA) is a fundamental technique for dimensionality reduction, and is NP-hard. In this paper, we introduce a randomized approximation algorithm for SPCA, which is based on the basic SDP relaxation. Our algorithm takes an (approximate) SDP solution, constructs one deterministic sparse solution and several randomized solutions, and outputs the best among them. Our algorithm has an approximation ratio of at most the sparsity constant with high probability, if called enough times. Under a technical assumption, which is consistently satisfied in our numerical tests, the average approximation ratio is also bounded by $\mathcal{O}(\log{d})$, where $d$ is the number of features. We show that this technical assumption is satisfied if the SDP solution is low-rank, or has exponentially decaying eigenvalues. We then present two classes of instances for which this technical assumption holds. We also demonstrate that in a covariance model, which generalizes the spiked Wishart model, the deterministic solution in our algorithm achieves a near-optimal approximation ratio. We demonstrate the efficacy of our algorithm through numerical tests on real-world datasets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript introduces a randomized approximation algorithm for Sparse Principal Component Analysis (SPCA) based on the basic SDP relaxation. Given an approximate SDP solution, the algorithm constructs one deterministic sparse vector and multiple randomized sparse vectors, then returns the best among them. It proves that the approximation ratio is at most the sparsity level with high probability (provided the algorithm is run sufficiently many times). Under a technical assumption on the spectrum of the SDP solution, the expected approximation ratio is O(log d). The assumption is shown to hold when the SDP solution is low-rank or has exponentially decaying eigenvalues, as well as for two explicit families of instances; the paper also establishes a near-optimal guarantee for the deterministic solution in a covariance model that generalizes the spiked Wishart model and reports supporting numerical experiments on real data.

Significance. If the stated guarantees hold, the work supplies a practical, theoretically grounded method for an NP-hard problem that improves on pure SDP rounding by adding a cheap randomization step. The explicit identification and partial validation of the technical assumption, together with the unconditional high-probability bound and the near-optimal result in the generalized spiked model, constitute a clear contribution to the approximation-algorithm literature on sparse PCA.

major comments (1)

[analysis of expected approximation ratio / technical assumption] The O(log d) average-case guarantee rests on the technical assumption about the SDP solution spectrum (low-rank or exponentially decaying eigenvalues). While the manuscript proves the assumption holds for low-rank solutions, exponentially decaying spectra, and two concrete instance classes, and reports that it is satisfied in the numerical tests, the assumption is not automatically true for arbitrary SDP solutions. An SDP solution whose eigenvalues decay slowly would invalidate the expectation analysis of the randomized rounding step, leaving only the weaker high-probability bound. This assumption is therefore load-bearing for the stronger average-case claim.

minor comments (2)

[abstract] The abstract and introduction should more explicitly separate the unconditional high-probability guarantee from the conditional O(log d) guarantee to avoid any impression that the logarithmic bound holds unconditionally.
[section introducing the technical assumption] A brief discussion or simple counter-example illustrating an SDP solution for which the technical assumption fails would help readers assess the practical scope of the O(log d) result.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback. We address the major comment below.

read point-by-point responses

Referee: [analysis of expected approximation ratio / technical assumption] The O(log d) average-case guarantee rests on the technical assumption about the SDP solution spectrum (low-rank or exponentially decaying eigenvalues). While the manuscript proves the assumption holds for low-rank solutions, exponentially decaying spectra, and two concrete instance classes, and reports that it is satisfied in the numerical tests, the assumption is not automatically true for arbitrary SDP solutions. An SDP solution whose eigenvalues decay slowly would invalidate the expectation analysis of the randomized rounding step, leaving only the weaker high-probability bound. This assumption is therefore load-bearing for the stronger average-case claim.

Authors: We agree that the O(log d) expected approximation ratio depends on the stated technical assumption on the spectrum of the SDP solution. The manuscript already proves that the assumption holds for low-rank solutions and for solutions with exponentially decaying eigenvalues, as well as for two explicit families of instances; it further reports that the assumption is satisfied throughout the numerical experiments. We acknowledge that the assumption need not hold for every possible SDP solution, in which case the expectation analysis would not apply and only the unconditional high-probability bound (approximation ratio at most the sparsity level) would remain. To clarify this distinction for readers, we will revise the manuscript to emphasize more explicitly that the high-probability guarantee requires no additional assumptions while the expected O(log d) bound is conditional, and to discuss the practical scope of the assumption in light of the cases where it is provably satisfied. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents a randomized approximation algorithm for SPCA derived from the basic SDP relaxation, with approximation ratios obtained via standard probabilistic analysis of randomized rounding. The high-probability bound of at most the sparsity constant holds unconditionally with enough repetitions, while the O(log d) average bound is derived under an explicitly stated technical assumption on the SDP solution spectrum (low-rank or exponentially decaying eigenvalues), which the authors prove holds for low-rank cases, specific instance families, and a generalized covariance model. These steps rely on mathematical proofs and the SDP formulation itself rather than reducing to fitted parameters from the same data, self-citations bearing the central load, or renaming known results. Numerical tests confirm the assumption but do not define the bounds, leaving the derivation self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on the existence of an approximate SDP solution, standard properties of semidefinite programming, and the technical assumption on eigenvalue decay or rank. No new free parameters are introduced beyond the input sparsity level k and the number of random trials. No invented entities are postulated.

axioms (2)

standard math The basic SDP relaxation of the sparse PCA formulation is solvable to sufficient accuracy in polynomial time.
Invoked when the algorithm is stated to take an (approximate) SDP solution as input.
domain assumption Randomized rounding from the SDP solution produces sparse vectors whose expected objective value can be bounded using the technical assumption.
Central to deriving the O(log d) average-case ratio.

pith-pipeline@v0.9.0 · 5718 in / 1613 out tokens · 36011 ms · 2026-05-22T00:00:01.969456+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Our algorithm transforms an (approximate) optimal solution W* of SPCA-SDP into a feasible solution for SPCA, in time O(d log d). ... approximation ratio ... bounded by O(log d) under ... sum of square roots of diagonal entries of W*

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · 3 internal anchors

[1]

High-dimensional analysis of semidefinite relaxations for sparse principal components

Arash A Amini and Martin J Wainwright. High-dimensional analysis of semidefinite relaxations for sparse principal components. In IEEE International Symposium on Information Theory , pages 2454–2458, 2008

work page 2008
[2]

The MOSEK optimization toolbox for MATLAB manual

MOSEK ApS. The MOSEK optimization toolbox for MATLAB manual. Version 9.2. , 2020

work page 2020
[3]

Sparse PCA via bipartite matchings

Megasthenis Asteris, Dimitris Papailiopoulos, Anastasios Kyrillidis, and Alexandros G Di- makis. Sparse PCA via bipartite matchings. Advances in Neural Information Processing Systems, 28, 2015

work page 2015
[4]

Certifiably optimal sparse principal component analysis

Lauren Berk and Dimitris Bertsimas. Certifiably optimal sparse principal component analysis. Mathematical Programming Computation, 11:381–420, 2019

work page 2019
[5]

Computational Lower Bounds for Sparse PCA

Quentin Berthet and Philippe Rigollet. Computational lower bounds for sparse PCA. arXiv preprint arXiv:1304.0828, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013
[6]

Optimal detection of sparse principal components in high dimension

Quentin Berthet and Philippe Rigollet. Optimal detection of sparse principal components in high dimension. 2013. 26

work page 2013
[7]

Solving large-scale sparse PCA to certifiable (near) optimality

Dimitris Bertsimas, Ryan Cory-Wright, and Jean Pauphilet. Solving large-scale sparse PCA to certifiable (near) optimality. J. Mach. Learn. Res. , 23(13):1–35, 2022

work page 2022
[8]

Sparse PCA: A geometric approach

Dimitris Bertsimas and Driss Lahlou Kitane. Sparse PCA: A geometric approach. Journal of Machine Learning Research, 23:1–33, 2022

work page 2022
[9]

A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization

Samuel Burer and Renato DC Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003

work page 2003
[10]

Local minima and convergence in low-rank semidef- inite programming

Samuel Burer and Renato DC Monteiro. Local minima and convergence in low-rank semidef- inite programming. Mathematical programming, 103(3):427–444, 2005

work page 2005
[11]

On the Worst-Case Approximability of Sparse PCA

Siu On Chan, Dimitris Papailiopoulos, and Aviad Rubinstein. On the worst-case approxima- bility of sparse PCA. arXiv preprint arXiv:1507.05950 , 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015
[12]

On the approximability of sparse PCA

Siu On Chan, Dimitris Papailliopoulos, and Aviad Rubinstein. On the approximability of sparse PCA. In Conference on Learning Theory, pages 623–646. PMLR, 2016

work page 2016
[13]

Approximation algorithms for sparse principal component analysis

Agniva Chowdhury, Petros Drineas, David P Woodruff, and Samson Zhou. Approximation algorithms for sparse principal component analysis. arXiv preprint arXiv:2006.12748 , 2020

work page arXiv 2006
[14]

On the burer–monteiro method for general semidefinite programs

Diego Cifuentes. On the burer–monteiro method for general semidefinite programs. Optimiza- tion Letters, 15(6):2299–2309, 2021

work page 2021
[15]

A direct for- mulation for sparse PCA using semidefinite programming

Alexandre d’Aspremont, Laurent Ghaoui, Michael Jordan, and Gert Lanckriet. A direct for- mulation for sparse PCA using semidefinite programming. Advances in neural information processing systems, 17, 2004

work page 2004
[16]

Sparse PCA on fixed-rank matrices

Alberto Del Pia. Sparse PCA on fixed-rank matrices. Mathematical Programming, pages 1–19, 2022

work page 2022
[17]

Efficient sparse PCA via block- diagonalization

Alberto Del Pia, Dekun Zhou, and Yinglun Zhu. Efficient sparse PCA via block- diagonalization. In The Thirteenth International Conference on Learning Representations , 2025

work page 2025
[18]

Sparse PCA via covariance thresholding

Yash Deshpande and Andrea Montanari. Sparse PCA via covariance thresholding. Advances in Neural Information Processing Systems , 27, 2014

work page 2014
[19]

Sparse principal component analysis and its $l_1$-relaxation

Santanu S Dey, Rahul Mazumder, Marco Molinaro, and Guanyi Wang. Sparse principal com- ponent analysis and its l 1-relaxation. arXiv preprint arXiv:1712.00800 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[20]

Using ℓ1-relaxation and integer program- ming to obtain dual bounds for sparse PCA

Santanu S Dey, Rahul Mazumder, and Guanyi Wang. Using ℓ1-relaxation and integer program- ming to obtain dual bounds for sparse PCA. Operations Research, 70(3):1914–1932, 2022

work page 1914
[21]

Solving sparse principal component analysis with global support

Santanu S Dey, Marco Molinaro, and Guanyi Wang. Solving sparse principal component analysis with global support. Mathematical Programming, pages 1–39, 2022

work page 2022
[22]

Subexponential-time algorithms for sparse PCA

Yunzi Ding, Dmitriy Kunisky, Alexander S Wein, and Afonso S Bandeira. Subexponential-time algorithms for sparse PCA. Foundations of Computational Mathematics , pages 1–50, 2023

work page 2023
[23]

Sparse PCA: algo- rithms, adversarial perturbations and certificates

Tommaso d’Orsi, Pravesh K Kothari, Gleb Novikov, and David Steurer. Sparse PCA: algo- rithms, adversarial perturbations and certificates. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) , pages 553–564. IEEE, 2020. 27

work page 2020
[24]

The best constants in the khintchine inequality

Uffe Haagerup. The best constants in the khintchine inequality. Studia Mathematica , 70(3):231–283, 1981

work page 1981
[25]

Matrix analysis, 1985

Roger A Horn and Charles R Johnson. Matrix analysis, 1985

work page 1985
[26]

Principal component analysis for special types of data

Ian T Jolliffe. Principal component analysis for special types of data . Springer, 2002

work page 2002
[27]

Convexification of permutation-invariant sets and an application to sparse PCA

Jinhak Kim, Mohit Tawarmalani, and Jean-Philippe P Richard. Convexification of permutation-invariant sets and an application to sparse PCA. arXiv preprint arXiv:1910.02573, 2019

work page arXiv 1910
[28]

Do semidefinite relaxations solve sparse PCA up to the information limit? The Annals of Statistics , 43(3):1300–1322, 2015

Robert Krauthgamer, Boaz Nadler, and Dan Vilenchik. Do semidefinite relaxations solve sparse PCA up to the information limit? The Annals of Statistics , 43(3):1300–1322, 2015

work page 2015
[29]

Nonlinear programming

Harold W Kuhn and Albert W Tucker. Nonlinear programming. In Traces and emergence of nonlinear programming, pages 247–258. Springer, 2014

work page 2014
[30]

Semidefinite programming and integer programming

Monique Laurent and Franz Rendl. Semidefinite programming and integer programming. In K. Aardal, G. Nemhauser, and R. Weismantel, editors, Handbook on Discrete Optimization , pages 393–514. Elsevier B.V., December 2005

work page 2005
[31]

Sparse principal compo- nent analysis for identifying ancestry-informative markers in genome-wide association studies

Seokho Lee, Michael P Epstein, Richard Duncan, and Xihong Lin. Sparse principal compo- nent analysis for identifying ancestry-informative markers in genome-wide association studies. Genetic epidemiology, 36(4):293–302, 2012

work page 2012
[32]

Li and W

Yongchun Li and Weijun Xie. Exact and approximation algorithms for sparse PCA. arXiv preprint arXiv:2008.12438, 2020

work page arXiv 2008
[33]

Np-hardness and inapproximability of sparse PCA

Malik Magdon-Ismail. Np-hardness and inapproximability of sparse PCA. Information Pro- cessing Letters, 126:35–38, 2017

work page 2017
[34]

Probability and computing: Randomization and proba- bilistic techniques in algorithms and data analysis

Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and proba- bilistic techniques in algorithms and data analysis . Cambridge university press, 2017

work page 2017
[35]

Informative feature selection for object recognition via sparse PCA

Nikhil Naikal, Allen Y Yang, and S Shankar Sastry. Informative feature selection for object recognition via sparse PCA. In 2011 International Conference on Computer Vision , pages 818–825. IEEE, 2011

work page 2011
[36]

SCS: Splitting conic solver, version 3.2.4

Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. SCS: Splitting conic solver, version 3.2.4. https://github.com/cvxgrp/scs, November 2023

work page 2023
[37]

Sparse PCA through low-rank approximations

Dimitris Papailiopoulos, Alexandros Dimakis, and Stavros Korokythakis. Sparse PCA through low-rank approximations. In International Conference on Machine Learning , pages 747–755. PMLR, 2013

work page 2013
[38]

A newton-cg algorithm with complexity guarantees for smooth unconstrained optimization

Cl´ ement W Royer, Michael O’Neill, and Stephen J Wright. A newton-cg algorithm with complexity guarantees for smooth unconstrained optimization. Mathematical Programming, 180:451–488, 2020

work page 2020
[39]

Semidefinite programming

Lieven Vandenberghe and Stephen Boyd. Semidefinite programming. SIAM review, 38(1):49– 95, 1996

work page 1996
[40]

High-dimensional probability: An introduction with applications in data science, volume 47

Roman Vershynin. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge university press, 2018. 28

work page 2018
[41]

Fantope projection and selection: A near- optimal convex relaxation of sparse PCA

Vincent Q Vu, Juhee Cho, Jing Lei, and Karl Rohe. Fantope projection and selection: A near- optimal convex relaxation of sparse PCA. Advances in neural information processing systems, 26, 2013

work page 2013
[42]

On the tightness of sdp relaxations of qcqps

Alex L Wang and Fatma Kılın¸ c-Karzan. On the tightness of sdp relaxations of qcqps. Mathe- matical Programming, 193(1):33–73, 2022

work page 2022
[43]

Accelerated first-order methods for a class of semidef- inite programs

Alex L Wang and Fatma Kılın¸ c-Karzan. Accelerated first-order methods for a class of semidef- inite programs. Mathematical Programming, 209(1):503–556, 2025

work page 2025
[44]

Statistical and computational trade-offs in estimation of sparse principal components

Tengyao Wang, Quentin Berthet, and Richard J Samworth. Statistical and computational trade-offs in estimation of sparse principal components. The Annals of Statistics , 44(5):1896– 1930, 2016

work page 1930
[45]

A conditional-gradient-based augmented lagrangian framework

Alp Yurtsever, Olivier Fercoq, and Volkan Cevher. A conditional-gradient-based augmented lagrangian framework. In International Conference on Machine Learning , pages 7272–7281. PMLR, 2019

work page 2019
[46]

Adaptive forward-backward greedy algorithm for learning sparse representations

Tong Zhang. Adaptive forward-backward greedy algorithm for learning sparse representations. IEEE transactions on information theory , 57(7):4689–4708, 2011. 29

work page 2011

[1] [1]

High-dimensional analysis of semidefinite relaxations for sparse principal components

Arash A Amini and Martin J Wainwright. High-dimensional analysis of semidefinite relaxations for sparse principal components. In IEEE International Symposium on Information Theory , pages 2454–2458, 2008

work page 2008

[2] [2]

The MOSEK optimization toolbox for MATLAB manual

MOSEK ApS. The MOSEK optimization toolbox for MATLAB manual. Version 9.2. , 2020

work page 2020

[3] [3]

Sparse PCA via bipartite matchings

Megasthenis Asteris, Dimitris Papailiopoulos, Anastasios Kyrillidis, and Alexandros G Di- makis. Sparse PCA via bipartite matchings. Advances in Neural Information Processing Systems, 28, 2015

work page 2015

[4] [4]

Certifiably optimal sparse principal component analysis

Lauren Berk and Dimitris Bertsimas. Certifiably optimal sparse principal component analysis. Mathematical Programming Computation, 11:381–420, 2019

work page 2019

[5] [5]

Computational Lower Bounds for Sparse PCA

Quentin Berthet and Philippe Rigollet. Computational lower bounds for sparse PCA. arXiv preprint arXiv:1304.0828, 2013

work page internal anchor Pith review Pith/arXiv arXiv 2013

[6] [6]

Optimal detection of sparse principal components in high dimension

Quentin Berthet and Philippe Rigollet. Optimal detection of sparse principal components in high dimension. 2013. 26

work page 2013

[7] [7]

Solving large-scale sparse PCA to certifiable (near) optimality

Dimitris Bertsimas, Ryan Cory-Wright, and Jean Pauphilet. Solving large-scale sparse PCA to certifiable (near) optimality. J. Mach. Learn. Res. , 23(13):1–35, 2022

work page 2022

[8] [8]

Sparse PCA: A geometric approach

Dimitris Bertsimas and Driss Lahlou Kitane. Sparse PCA: A geometric approach. Journal of Machine Learning Research, 23:1–33, 2022

work page 2022

[9] [9]

A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization

Samuel Burer and Renato DC Monteiro. A nonlinear programming algorithm for solving semidefinite programs via low-rank factorization. Mathematical Programming, 95(2):329–357, 2003

work page 2003

[10] [10]

Local minima and convergence in low-rank semidef- inite programming

Samuel Burer and Renato DC Monteiro. Local minima and convergence in low-rank semidef- inite programming. Mathematical programming, 103(3):427–444, 2005

work page 2005

[11] [11]

On the Worst-Case Approximability of Sparse PCA

Siu On Chan, Dimitris Papailiopoulos, and Aviad Rubinstein. On the worst-case approxima- bility of sparse PCA. arXiv preprint arXiv:1507.05950 , 2015

work page internal anchor Pith review Pith/arXiv arXiv 2015

[12] [12]

On the approximability of sparse PCA

Siu On Chan, Dimitris Papailliopoulos, and Aviad Rubinstein. On the approximability of sparse PCA. In Conference on Learning Theory, pages 623–646. PMLR, 2016

work page 2016

[13] [13]

Approximation algorithms for sparse principal component analysis

Agniva Chowdhury, Petros Drineas, David P Woodruff, and Samson Zhou. Approximation algorithms for sparse principal component analysis. arXiv preprint arXiv:2006.12748 , 2020

work page arXiv 2006

[14] [14]

On the burer–monteiro method for general semidefinite programs

Diego Cifuentes. On the burer–monteiro method for general semidefinite programs. Optimiza- tion Letters, 15(6):2299–2309, 2021

work page 2021

[15] [15]

A direct for- mulation for sparse PCA using semidefinite programming

Alexandre d’Aspremont, Laurent Ghaoui, Michael Jordan, and Gert Lanckriet. A direct for- mulation for sparse PCA using semidefinite programming. Advances in neural information processing systems, 17, 2004

work page 2004

[16] [16]

Sparse PCA on fixed-rank matrices

Alberto Del Pia. Sparse PCA on fixed-rank matrices. Mathematical Programming, pages 1–19, 2022

work page 2022

[17] [17]

Efficient sparse PCA via block- diagonalization

Alberto Del Pia, Dekun Zhou, and Yinglun Zhu. Efficient sparse PCA via block- diagonalization. In The Thirteenth International Conference on Learning Representations , 2025

work page 2025

[18] [18]

Sparse PCA via covariance thresholding

Yash Deshpande and Andrea Montanari. Sparse PCA via covariance thresholding. Advances in Neural Information Processing Systems , 27, 2014

work page 2014

[19] [19]

Sparse principal component analysis and its $l_1$-relaxation

Santanu S Dey, Rahul Mazumder, Marco Molinaro, and Guanyi Wang. Sparse principal com- ponent analysis and its l 1-relaxation. arXiv preprint arXiv:1712.00800 , 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[20] [20]

Using ℓ1-relaxation and integer program- ming to obtain dual bounds for sparse PCA

Santanu S Dey, Rahul Mazumder, and Guanyi Wang. Using ℓ1-relaxation and integer program- ming to obtain dual bounds for sparse PCA. Operations Research, 70(3):1914–1932, 2022

work page 1914

[21] [21]

Solving sparse principal component analysis with global support

Santanu S Dey, Marco Molinaro, and Guanyi Wang. Solving sparse principal component analysis with global support. Mathematical Programming, pages 1–39, 2022

work page 2022

[22] [22]

Subexponential-time algorithms for sparse PCA

Yunzi Ding, Dmitriy Kunisky, Alexander S Wein, and Afonso S Bandeira. Subexponential-time algorithms for sparse PCA. Foundations of Computational Mathematics , pages 1–50, 2023

work page 2023

[23] [23]

Sparse PCA: algo- rithms, adversarial perturbations and certificates

Tommaso d’Orsi, Pravesh K Kothari, Gleb Novikov, and David Steurer. Sparse PCA: algo- rithms, adversarial perturbations and certificates. In 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS) , pages 553–564. IEEE, 2020. 27

work page 2020

[24] [24]

The best constants in the khintchine inequality

Uffe Haagerup. The best constants in the khintchine inequality. Studia Mathematica , 70(3):231–283, 1981

work page 1981

[25] [25]

Matrix analysis, 1985

Roger A Horn and Charles R Johnson. Matrix analysis, 1985

work page 1985

[26] [26]

Principal component analysis for special types of data

Ian T Jolliffe. Principal component analysis for special types of data . Springer, 2002

work page 2002

[27] [27]

Convexification of permutation-invariant sets and an application to sparse PCA

Jinhak Kim, Mohit Tawarmalani, and Jean-Philippe P Richard. Convexification of permutation-invariant sets and an application to sparse PCA. arXiv preprint arXiv:1910.02573, 2019

work page arXiv 1910

[28] [28]

Do semidefinite relaxations solve sparse PCA up to the information limit? The Annals of Statistics , 43(3):1300–1322, 2015

Robert Krauthgamer, Boaz Nadler, and Dan Vilenchik. Do semidefinite relaxations solve sparse PCA up to the information limit? The Annals of Statistics , 43(3):1300–1322, 2015

work page 2015

[29] [29]

Nonlinear programming

Harold W Kuhn and Albert W Tucker. Nonlinear programming. In Traces and emergence of nonlinear programming, pages 247–258. Springer, 2014

work page 2014

[30] [30]

Semidefinite programming and integer programming

Monique Laurent and Franz Rendl. Semidefinite programming and integer programming. In K. Aardal, G. Nemhauser, and R. Weismantel, editors, Handbook on Discrete Optimization , pages 393–514. Elsevier B.V., December 2005

work page 2005

[31] [31]

Sparse principal compo- nent analysis for identifying ancestry-informative markers in genome-wide association studies

Seokho Lee, Michael P Epstein, Richard Duncan, and Xihong Lin. Sparse principal compo- nent analysis for identifying ancestry-informative markers in genome-wide association studies. Genetic epidemiology, 36(4):293–302, 2012

work page 2012

[32] [32]

Li and W

Yongchun Li and Weijun Xie. Exact and approximation algorithms for sparse PCA. arXiv preprint arXiv:2008.12438, 2020

work page arXiv 2008

[33] [33]

Np-hardness and inapproximability of sparse PCA

Malik Magdon-Ismail. Np-hardness and inapproximability of sparse PCA. Information Pro- cessing Letters, 126:35–38, 2017

work page 2017

[34] [34]

Probability and computing: Randomization and proba- bilistic techniques in algorithms and data analysis

Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and proba- bilistic techniques in algorithms and data analysis . Cambridge university press, 2017

work page 2017

[35] [35]

Informative feature selection for object recognition via sparse PCA

Nikhil Naikal, Allen Y Yang, and S Shankar Sastry. Informative feature selection for object recognition via sparse PCA. In 2011 International Conference on Computer Vision , pages 818–825. IEEE, 2011

work page 2011

[36] [36]

SCS: Splitting conic solver, version 3.2.4

Brendan O’Donoghue, Eric Chu, Neal Parikh, and Stephen Boyd. SCS: Splitting conic solver, version 3.2.4. https://github.com/cvxgrp/scs, November 2023

work page 2023

[37] [37]

Sparse PCA through low-rank approximations

Dimitris Papailiopoulos, Alexandros Dimakis, and Stavros Korokythakis. Sparse PCA through low-rank approximations. In International Conference on Machine Learning , pages 747–755. PMLR, 2013

work page 2013

[38] [38]

A newton-cg algorithm with complexity guarantees for smooth unconstrained optimization

Cl´ ement W Royer, Michael O’Neill, and Stephen J Wright. A newton-cg algorithm with complexity guarantees for smooth unconstrained optimization. Mathematical Programming, 180:451–488, 2020

work page 2020

[39] [39]

Semidefinite programming

Lieven Vandenberghe and Stephen Boyd. Semidefinite programming. SIAM review, 38(1):49– 95, 1996

work page 1996

[40] [40]

High-dimensional probability: An introduction with applications in data science, volume 47

Roman Vershynin. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge university press, 2018. 28

work page 2018

[41] [41]

Fantope projection and selection: A near- optimal convex relaxation of sparse PCA

Vincent Q Vu, Juhee Cho, Jing Lei, and Karl Rohe. Fantope projection and selection: A near- optimal convex relaxation of sparse PCA. Advances in neural information processing systems, 26, 2013

work page 2013

[42] [42]

On the tightness of sdp relaxations of qcqps

Alex L Wang and Fatma Kılın¸ c-Karzan. On the tightness of sdp relaxations of qcqps. Mathe- matical Programming, 193(1):33–73, 2022

work page 2022

[43] [43]

Accelerated first-order methods for a class of semidef- inite programs

Alex L Wang and Fatma Kılın¸ c-Karzan. Accelerated first-order methods for a class of semidef- inite programs. Mathematical Programming, 209(1):503–556, 2025

work page 2025

[44] [44]

Statistical and computational trade-offs in estimation of sparse principal components

Tengyao Wang, Quentin Berthet, and Richard J Samworth. Statistical and computational trade-offs in estimation of sparse principal components. The Annals of Statistics , 44(5):1896– 1930, 2016

work page 1930

[45] [45]

A conditional-gradient-based augmented lagrangian framework

Alp Yurtsever, Olivier Fercoq, and Volkan Cevher. A conditional-gradient-based augmented lagrangian framework. In International Conference on Machine Learning , pages 7272–7281. PMLR, 2019

work page 2019

[46] [46]

Adaptive forward-backward greedy algorithm for learning sparse representations

Tong Zhang. Adaptive forward-backward greedy algorithm for learning sparse representations. IEEE transactions on information theory , 57(7):4689–4708, 2011. 29

work page 2011