arxiv: 1702.08607 · v1 · pith:QZZU4CUPnew · submitted 2017-02-28 · 💻 cs.CG

Faster DB-scan and HDB-scan in Low-Dimensional Euclidean Spaces

keywords: algorithm, dbscan, version, fixed, hdbscan, parameter, practice, time

We present a new algorithm for the widely used density-based clustering method DBscan. Our algorithm computes the DBscan-clustering in $O(n\log n)$ time in $\mathbb{R}^2$, irrespective of the scale parameter $\varepsilon$ (and assuming the second parameter MinPts is set to a fixed constant, as is the case in practice). Experiments show that the new algorithm is not only fast in theory, but that a slightly simplified version is competitive in practice and much less sensitive to the choice of $\varepsilon$ than the original DBscan algorithm. We also present an $O(n\log n)$ randomized algorithm for HDBscan in the plane---HDBscan is a hierarchical version of DBscan introduced recently---and we show how to compute an approximate version of HDBscan in near-linear time in any fixed dimension.
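
To make the roles of the two parameters concrete, here is a minimal sketch of the textbook DBscan definition. It is not the algorithm of the paper: it uses brute-force neighbor search and therefore runs in quadratic time, whereas the abstract's contribution is an $O(n\log n)$ method in the plane. The names `dbscan`, `eps`, `min_pts`, and `NOISE` are illustrative assumptions, and the sketch assumes a point counts as a member of its own $\varepsilon$-neighborhood.

```python
from collections import deque
from math import dist

NOISE = -1  # label for points that belong to no cluster (illustrative choice)


def dbscan(points, eps, min_pts):
    """Textbook DBscan with brute-force neighbor search (quadratic time).

    This is a reference sketch of the clustering definition only, not the
    faster algorithm described in the abstract.
    """
    n = len(points)
    # eps-neighborhood of every point; a point is assumed to neighbor itself.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points in its eps-neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighbors]

    labels = [NOISE] * n
    cluster = 0
    for i in range(n):
        if not is_core[i] or labels[i] != NOISE:
            continue
        # Grow a new cluster outward from an unlabeled core point.
        labels[i] = cluster
        queue = deque([i])
        while queue:
            p = queue.popleft()
            for q in neighbors[p]:
                if labels[q] == NOISE:
                    labels[q] = cluster
                    # Only core points propagate the cluster further;
                    # border points are labeled but not expanded.
                    if is_core[q]:
                        queue.append(q)
        cluster += 1
    return labels
```

For example, `dbscan([(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (50, 50)], eps=1.5, min_pts=3)` returns `[0, 0, 0, 0, 1, 1, 1, -1]`: two clusters and one noise point. The all-pairs distance computation is what makes this sketch quadratic regardless of how $\varepsilon$ is chosen.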
