pith. sign in

arxiv: 1610.02918 · v1 · pith:BGCADD7Gnew · submitted 2016-10-10 · 📊 stat.ML · cond-mat.dis-nn· cs.IT· math.IT

Phase transitions and optimal algorithms in high-dimensional Gaussian mixture clustering

classification 📊 stat.ML cond-mat.dis-nncs.ITmath.IT
keywords alphaclustersalgorithmsclusteringdeterminegaussianhigh-dimensionalinformation-theoretically
0
0 comments X
read the original abstract

We consider the problem of Gaussian mixture clustering in the high-dimensional limit where the data consists of $m$ points in $n$ dimensions, $n,m \rightarrow \infty$ and $\alpha = m/n$ stays finite. Using exact but non-rigorous methods from statistical physics, we determine the critical value of $\alpha$ and the distance between the clusters at which it becomes information-theoretically possible to reconstruct the membership into clusters better than chance. We also determine the accuracy achievable by the Bayes-optimal estimation algorithm. In particular, we find that when the number of clusters is sufficiently large, $r > 4 + 2 \sqrt{\alpha}$, there is a gap between the threshold for information-theoretically optimal performance and the threshold at which known algorithms succeed.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.