pith. sign in

arxiv: 1306.2035 · v1 · pith:VQJEQ5ZQnew · submitted 2013-06-09 · 📊 stat.ML · cs.LG· math.ST· stat.TH

Minimax Theory for High-dimensional Gaussian Mixtures with Sparse Mean Separation

classification 📊 stat.ML cs.LGmath.STstat.TH
keywords meanseparationdimensionsboundsclusteringcomplexitycomputationallyefficient
0
0 comments X p. Extension
pith:VQJEQ5ZQ Add to your LaTeX paper What is a Pith Number?
\usepackage{pith}
\pithnumber{VQJEQ5ZQ}

Prints a linked pith:VQJEQ5ZQ badge after your title and writes the identifier into PDF metadata. Compiles on arXiv with no extra files. Learn more

read the original abstract

While several papers have investigated computationally and statistically efficient methods for learning Gaussian mixtures, precise minimax bounds for their statistical performance as well as fundamental limits in high-dimensional settings are not well-understood. In this paper, we provide precise information theoretic bounds on the clustering accuracy and sample complexity of learning a mixture of two isotropic Gaussians in high dimensions under small mean separation. If there is a sparse subset of relevant dimensions that determine the mean separation, then the sample complexity only depends on the number of relevant dimensions and mean separation, and can be achieved by a simple computationally efficient procedure. Our results provide the first step of a theoretical basis for recent methods that combine feature selection and clustering.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.