arXiv: 1011.2771 · v1 · pith:EE7XZQGInew · submitted 2010-11-11 · stat.ML · math.ST · stat.TH

Stability of Density-Based Clustering

High density clusters can be characterized by the connected components of a level set $L(\lambda) = \{x:\ p(x)>\lambda\}$ of the underlying probability density function $p$ generating the data, at some appropriate level $\lambda\geq 0$. The complete hierarchical clustering can be characterized by a cluster tree ${\cal T}= \bigcup_{\lambda} L(\lambda)$. In this paper, we study the behavior of a density level set estimate $\widehat L(\lambda)$ and cluster tree estimate $\widehat{\cal{T}}$ based on a kernel density estimator with kernel bandwidth $h$. We define two notions of instability to measure the variability of $\widehat L(\lambda)$ and $\widehat{\cal{T}}$ as a function of $h$, and investigate the theoretical properties of these instability measures.
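
As a rough illustration of the objects named in the abstract, the sketch below builds a one-dimensional Gaussian kernel density estimate with bandwidth $h$, forms the level set estimate $\widehat L(\lambda)$ on a grid, and counts its connected components for a few bandwidths. The helper names `kde` and `level_set_components`, the Gaussian kernel, the grid, the simulated two-cluster sample, and the value of $\lambda$ are all assumptions made for this example. It does not implement the paper's instability measures; it only shows how the estimated clusters at a fixed level can change with $h$.

```python
import numpy as np

def kde(grid, data, h):
    """Gaussian kernel density estimate with bandwidth h, evaluated on grid.

    Hypothetical helper for illustration; the kernel choice is an assumption.
    """
    z = (grid[:, None] - data[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(data) * h * np.sqrt(2 * np.pi))

def level_set_components(p_hat, lam):
    """Count connected components of {x : p_hat(x) > lam} on a 1-D grid.

    On a grid, a component is a maximal run of consecutive points above lam.
    """
    above = p_hat > lam
    # A run starts where a point is above the level and its left neighbour is not.
    left_neighbour = np.concatenate(([False], above[:-1]))
    return int((above & ~left_neighbour).sum())

# Simulated sample from two well-separated clusters (assumed data, not from the paper).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(2.0, 0.5, 200)])
grid = np.linspace(-5.0, 5.0, 1001)
lam = 0.05

# Number of estimated clusters at level lam as the bandwidth h varies.
for h in (0.05, 0.3, 3.0):
    p_hat = kde(grid, data, h)
    print(f"h={h}: {level_set_components(p_hat, lam)} component(s)")
```

A very small bandwidth tends to fragment the level set into spurious components, while a very large one tends to merge distinct clusters into one. That dependence on $h$ is the kind of variability the instability measures in the paper are meant to quantify.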