k-Means has Polynomial Smoothed Complexity
The k-means method is one of the most widely used clustering algorithms, drawing its popularity from its speed in practice. Recently, however, it was shown to have exponential worst-case running time. In order to close the gap between practical performance and theoretical analysis, the k-means method has been studied in the model of smoothed analysis. But even the smoothed analyses so far are unsatisfactory as the bounds are still super-polynomial in the number n of data points. In this paper, we settle the smoothed running time of the k-means method. We show that the smoothed number of iterations is bounded by a polynomial in n and 1/σ, where σ is the standard deviation of the Gaussian perturbations. This means that if an arbitrary input data set is randomly perturbed, then the k-means method will run in expected polynomial time on that input set.
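The k-means method analyzed in the abstract is Lloyd's iteration: alternately assign each point to its nearest center, then move each center to the mean of its cluster, until nothing changes. A minimal sketch of one smoothed-analysis trial — perturbing an input with Gaussian noise of standard deviation σ and counting how many Lloyd iterations follow — might look like the code below (the function names, the toy grid data set, and the choice of initial centers are illustrative, not taken from the paper):

```python
import random

def kmeans_iterations(points, centers, max_iter=1000):
    """Run Lloyd's k-means method; return (final centers, iteration count)."""
    for it in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for j, cl in enumerate(clusters):
            if cl:
                new_centers.append(tuple(sum(xs) / len(cl) for xs in zip(*cl)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center alone
        if new_centers == centers:  # a full iteration changed nothing: converged
            return new_centers, it
        centers = new_centers
    return centers, max_iter

def perturb(points, sigma, rng):
    """Smoothed-analysis perturbation: add i.i.d. Gaussian noise (std dev sigma)."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in p) for p in points]

rng = random.Random(0)
data = [(float(i % 10), float(i // 10)) for i in range(100)]  # toy 10x10 grid
noisy = perturb(data, sigma=0.5, rng=rng)
centers, iters = kmeans_iterations(noisy, centers=noisy[:3])
```

The paper's result concerns exactly the quantity `iters`: its expectation over the Gaussian perturbation is polynomial in the number of points n and in 1/σ, even though adversarial unperturbed inputs can force exponentially many iterations.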
Forward citations
Cited by 1 Pith paper
- Variance Matters: Improving Domain Adaptation via Stratified Sampling
VaRDASS improves unsupervised domain adaptation by using stratified sampling to reduce variance in discrepancy estimation for measures like correlation alignment and MMD, with derived error bounds, an optimality proof...