pith. sign in

arxiv: 1508.00625 · v1 · pith:4HNJSMAKnew · submitted 2015-08-04 · 📊 stat.ML · cs.DS· cs.LG· math.OC

Sparse PCA via Bipartite Matchings

classification 📊 stat.ML cs.DScs.LGmath.OC
keywords componentsdatasparsealgorithmproblembipartitecapturedisjoint
0
0 comments X
read the original abstract

We consider the following multi-component sparse PCA problem: given a set of data points, we seek to extract a small number of sparse components with disjoint supports that jointly capture the maximum possible variance. These components can be computed one by one, repeatedly solving the single-component problem and deflating the input data matrix, but as we show this greedy procedure is suboptimal. We present a novel algorithm for sparse PCA that jointly optimizes multiple disjoint components. The extracted features capture variance that lies within a multiplicative factor arbitrarily close to 1 from the optimal. Our algorithm is combinatorial and computes the desired components by solving multiple instances of the bipartite maximum weight matching problem. Its complexity grows as a low order polynomial in the ambient dimension of the input data matrix, but exponentially in its rank. However, it can be effectively applied on a low-dimensional sketch of the data; this allows us to obtain polynomial-time approximation guarantees via spectral bounds. We evaluate our algorithm on real data-sets and empirically demonstrate that in many cases it outperforms existing, deflation-based approaches.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.