(Un)detectable cluster structure in sparse networks

Joerg Reichardt; Michele Leone

read the original abstract

We study the problem of recovering a known cluster structure in a sparse network, also known as the planted partitioning problem, by means of statistical mechanics. We find a sharp transition from un-recoverable to recoverable structure as a function of the separation of the clusters. For multivariate data, such transitions have been observed frequently, but always as a function of the number of data points provided, i.e. given a large enough data set, two point clouds can always be recognized as different clusters, as long as their separation is non-zero. In contrast, for the sparse networks studied here, a cluster structure remains undetectable even in an infinitely large network if a critical separation is not exceeded. We give analytic formulas for this critical separation as a function of the degree distribution of the network and calculate the shape of the recoverability-transition. Our findings have implications for unsupervised learning and data-mining in relational data bases and provide bounds on the achievable performance of graph clustering algorithms.

(Un)detectable cluster structure in sparse networks

discussion (0)