pith. machine review for the scientific record.

arxiv: 1604.06443 · v2 · submitted 2016-04-21 · 💻 cs.DS · cs.IT · cs.LG · math.IT · math.ST · stat.ML · stat.TH


Robust Estimators in High Dimensions without the Computational Intractability

keywords: learning, distribution, error, high-dimensional, agnostic, algorithms, computationally, distributions
abstract

We study high-dimensional distribution learning in an agnostic setting where an adversary is allowed to arbitrarily corrupt an $\varepsilon$-fraction of the samples. Such questions have a rich history spanning statistics, machine learning and theoretical computer science. Even in the most basic settings, the only known approaches are either computationally inefficient or lose dimension-dependent factors in their error guarantees. This raises the following question: Is high-dimensional agnostic distribution learning even possible, algorithmically? In this work, we obtain the first computationally efficient algorithms with dimension-independent error guarantees for agnostically learning several fundamental classes of high-dimensional distributions: (1) a single Gaussian, (2) a product distribution on the hypercube, (3) mixtures of two product distributions (under a natural balancedness condition), and (4) mixtures of spherical Gaussians. Our algorithms achieve error that is independent of the dimension, and in many cases scales nearly linearly with the fraction of adversarially corrupted samples. Moreover, we develop a general recipe for detecting and correcting corruptions in high dimensions that may be applicable to many other problems.
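To make the "detect and correct corruptions" idea concrete, here is a minimal illustrative sketch in the spirit of spectral filtering for robust mean estimation of an identity-covariance Gaussian: if the top eigenvalue of the empirical covariance is much larger than 1, the corresponding eigenvector flags a direction in which outliers concentrate, and the points with the largest projections along it are discarded. This is a simplified toy version, not the paper's actual algorithm; the function name `filter_mean` and the `threshold` parameter are hypothetical choices for this sketch.

```python
import numpy as np

def filter_mean(samples, eps, threshold=10.0):
    """Estimate the mean of an eps-corrupted sample from N(mu, I).

    Illustrative sketch only: repeatedly inspect the top eigenvalue of
    the empirical covariance; if it exceeds what an identity-covariance
    Gaussian would produce, remove the eps-fraction of points whose
    projections onto the top eigenvector are largest, then repeat.
    """
    X = np.asarray(samples, dtype=float)
    while len(X) > 1:
        mu = X.mean(axis=0)
        cov = np.cov(X, rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)  # ascending order
        top_val, top_vec = eigvals[-1], eigvecs[:, -1]
        # No spectral signature of corruption: accept the empirical mean.
        if top_val <= 1.0 + threshold * eps:
            return mu
        # Score each point by its squared projection on the flagged direction.
        scores = ((X - mu) @ top_vec) ** 2
        # Discard the eps-fraction of points with the largest scores.
        k = max(1, int(eps * len(X)))
        keep = np.argsort(scores)[: len(X) - k]
        X = X[keep]
    return X.mean(axis=0)
```

For example, with 10% of the points replaced by a far-away cluster, the empirical mean is pulled toward the outliers while the filtered estimate stays near the true mean, with error that does not grow with the dimension in the way naive outlier-removal heuristics do.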

