Sufficient conditions and sharp thresholds are given for exact recovery via MLE in dependent Gaussian mixture models for community detection, including singular covariances.
PECOK: a convex optimization approach to variable clustering
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
The problem of variable clustering is that of grouping similar components of a $p$-dimensional vector $X=(X_{1},\ldots,X_{p})$, and estimating these groups from $n$ independent copies of $X$. When cluster similarity is defined via $G$-latent models, in which groups of $X$-variables have a common latent generator, and groups are relative to a partition $G$ of the index set $\{1, \ldots, p\}$, the most natural clustering strategy is $K$-means. We explain why this strategy cannot lead to perfect cluster recovery and offer a correction, based on semi-definite programing, that can be viewed as a penalized convex relaxation of $K$-means (PECOK). We introduce a cluster separation measure tailored to $G$-latent models, and derive its minimax lower bound for perfect cluster recovery. The clusters estimated by PECOK are shown to recover $G$ at a near minimax optimal cluster separation rate, a result that holds true even if $K$, the number of clusters, is estimated adaptively from the data. We compare PECOK with appropriate corrections of spectral clustering-type procedures, and show that the former outperforms the latter for perfect cluster recovery of minimally separated clusters.
fields
math.ST 1years
2022 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Exact Recovery of Community Detection in dependent Gaussian Mixture Models
Sufficient conditions and sharp thresholds are given for exact recovery via MLE in dependent Gaussian mixture models for community detection, including singular covariances.