pith. sign in

arxiv: 1802.00459 · v2 · pith:EUH5A3O2new · submitted 2018-02-01 · 💻 cs.DS · cs.LG· stat.ML

Nearly Optimal Dynamic k-Means Clustering for High-Dimensional Data

classification 💻 cs.DS cs.LGstat.ML
keywords dynamicmeansspacealgorithmclusteringdatadeltanearly
0
0 comments X
read the original abstract

We consider the $k$-means clustering problem in the dynamic streaming setting, where points from a discrete Euclidean space $\{1, 2, \ldots, \Delta\}^d$ can be dynamically inserted to or deleted from the dataset. For this problem, we provide a one-pass coreset construction algorithm using space $\tilde{O}(k\cdot \mathrm{poly}(d, \log\Delta))$, where $k$ is the target number of centers. To our knowledge, this is the first dynamic geometric data stream algorithm for $k$-means using space polynomial in dimension and nearly optimal (linear) in $k$.

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.