Analysis of Ward's Method

Anna Gro{\ss}wendt; Heiko R\"oglin; Melanie Schmidt

arxiv: 1907.05094 · v1 · pith:C4F7DF5Anew · submitted 2019-07-11 · 💻 cs.DS · cs.CG· cs.LG

Analysis of Ward's Method

Anna Gro{\ss}wendt , Heiko R\"oglin , Melanie Schmidt This is my paper

Pith reviewed 2026-05-24 23:14 UTC · model grok-4.3

classification 💻 cs.DS cs.CGcs.LG

keywords Ward's methodhierarchical clusteringk-meansapproximation algorithmswell-separated clusterscomplete linkagegreedy merging

0 comments

The pith

If the optimal k-clustering is well separated, Ward's method produces a 2-approximation for the k-means objective.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Ward's method is a popular greedy approach to hierarchical clustering that repeatedly merges the two clusters whose combination increases the overall k-means cost by the smallest amount. The analysis shows that when the best possible k-clustering has clusters that are well separated from each other, the output of this method at the step with k clusters is guaranteed to have cost at most twice the optimal. Adding a balance condition on cluster sizes allows the method to recover the exact optimal clustering. The guarantees apply even in high dimensions where many other algorithms struggle. Without the separation assumption the approximation ratio can grow exponentially with dimension.

Core claim

Ward's method computes a 2-approximation with respect to the k-means objective function if the optimal k-clustering is well separated. If additionally the optimal clustering also satisfies a balance condition, then Ward's method fully recovers the optimum solution. These results hold in arbitrary dimension.

What carries the argument

Ward's method, the complete-linkage heuristic that always merges the pair of clusters whose merge increases the k-means cost the least.

Load-bearing premise

The optimal k-clustering must have well-separated clusters for the approximation guarantee to hold.

What would settle it

A data set in high dimension where the optimal clusters are well separated but the hierarchy from Ward's method has a k-means cost more than twice the optimum at the k level.

Figures

Figures reproduced from arXiv: 1907.05094 by Anna Gro{\ss}wendt, Heiko R\"oglin, Melanie Schmidt.

**Figure 1.** Figure 1: Point set Pd from the family of worst-case examples, drawn for d = 2 and d = 3. The heavy points are drawn larger. To further simplify the exposition, the below definitions use points of infinite weight and assume that the optimal cluster centers coincide with these infinite weight points. For any finite realization of the example, that is not the case. To ensure that Ward actually behaves like described i… view at source ↗

**Figure 2.** Figure 2: The pricipal phases of development of a 2-composed cluster. The phases We will use the reordering lemma (Lemma 26) to sort the merges into phases and then analyze the cost of the solution after each phase. In the following, we call a cluster that contains points from more than one optimum cluster composed, more precisely, we call it an ℓ-composed cluster if it contains points from ℓ different optimum clust… view at source ↗

**Figure 3.** Figure 3: Different types of good merge situations. A part of a [PITH_FULL_IMAGE:figures/full_fig_p023_3.png] view at source ↗

read the original abstract

We study Ward's method for the hierarchical $k$-means problem. This popular greedy heuristic is based on the \emph{complete linkage} paradigm: Starting with all data points as singleton clusters, it successively merges two clusters to form a clustering with one cluster less. The pair of clusters is chosen to (locally) minimize the $k$-means cost of the clustering in the next step. Complete linkage algorithms are very popular for hierarchical clustering problems, yet their theoretical properties have been studied relatively little. For the Euclidean $k$-center problem, Ackermann et al. show that the $k$-clustering in the hierarchy computed by complete linkage has a worst-case approximation ratio of $\Theta(\log k)$. If the data lies in $\mathbb{R}^d$ for constant dimension $d$, the guarantee improves to $\mathcal{O}(1)$, but the $\mathcal{O}$-notation hides a linear dependence on $d$. Complete linkage for $k$-median or $k$-means has not been analyzed so far. In this paper, we show that Ward's method computes a $2$-approximation with respect to the $k$-means objective function if the optimal $k$-clustering is well separated. If additionally the optimal clustering also satisfies a balance condition, then Ward's method fully recovers the optimum solution. These results hold in arbitrary dimension. We accompany our positive results with a lower bound of $\Omega((3/2)^d)$ for data sets in $\mathbb{R}^d$ that holds if no separation is guaranteed, and with lower bounds when the guaranteed separation is not sufficiently strong. Finally, we show that Ward produces an $\mathcal{O}(1)$-approximative clustering for one-dimensional data sets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

First analysis of Ward's method for k-means gives conditional 2-approx and exact recovery under separation, plus dimension-dependent lower bounds.

read the letter

This paper delivers the first approximation analysis of Ward's method specifically for the k-means objective. It proves a 2-approximation when the optimal clustering is well-separated, and exact recovery of the optimum if the clusters are also balanced. Both results hold in arbitrary dimension. It pairs these with an Omega((3/2)^d) lower bound in d dimensions when separation is absent, plus further lower bounds for weak separation, and an unconditional O(1)-approximation in one dimension. Prior work had only treated complete linkage for k-center, so the extension to k-means is the main novelty. The positive results are clean and the conditioning is stated up front rather than hidden. The lower bounds usefully show that the separation assumption is necessary in high dimensions. The 1D result stands out as a case where no extra assumptions are needed. The claims rest on standard k-means properties and explicit separation conditions, with no sign of circular definitions or fitted parameters. The main limitation is that the 2-approx and recovery guarantees do not apply in the worst case without separation; this is typical for this style of analysis but restricts how broadly the positive results can be used in practice. The dimension dependence in the lower bound is expected and not a flaw. Overall the work is focused and the argument structure holds together from the abstract and stress-test notes. It is aimed at researchers who study theoretical guarantees for clustering heuristics. The paper is worth a serious referee because it fills a clear gap with both upper and lower bounds, even if the results are conditional.

Referee Report

0 major / 3 minor

Summary. The manuscript analyzes Ward's method (a complete-linkage greedy algorithm) for hierarchical k-means clustering. It proves that if the optimal k-clustering is well-separated, then Ward's method yields a 2-approximation to the k-means objective in arbitrary dimension; with an added balance condition on the optimal clustering, it recovers the optimum exactly. These positive results are accompanied by a dimension-dependent lower bound of Ω((3/2)^d) when no separation is assumed, further lower bounds for insufficient separation, and an unconditional O(1)-approximation for one-dimensional data.

Significance. If the claims hold, the work supplies the first non-trivial approximation and recovery guarantees for Ward's method on k-means, a widely used practical heuristic whose theoretical properties had remained largely unanalyzed. The explicit conditioning on separation (and balance), the matching-style lower bounds, and the unconditional 1D result together delineate the regime in which the method succeeds. This advances the theory of linkage-based clustering algorithms and supplies concrete, falsifiable conditions under which a popular greedy procedure is provably effective.

minor comments (3)

[Abstract] Abstract: the terms 'well separated' and 'balance condition' are used without even a one-sentence gloss; a brief parenthetical definition or forward reference to their formal statements (presumably in §2 or §3) would improve readability for readers who consult only the abstract.
[Lower bounds section] The lower-bound constructions (especially the Ω((3/2)^d) family) are stated to hold 'if no separation is guaranteed'; it would be helpful to include a short remark clarifying whether these constructions also satisfy or violate the balance condition used in the positive recovery theorem.
[Preliminaries] Notation for the k-means cost and the separation parameter should be introduced once, early, and used consistently; occasional re-definition of symbols across sections can be avoided by a single 'Notation' paragraph.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the manuscript, the clear summary of our results, and the recommendation for minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper derives its 2-approximation and exact recovery results for Ward's method explicitly from the assumption that the optimal k-clustering is well-separated (and balanced for recovery). These are standard conditional approximation guarantees in clustering analysis, obtained via direct analysis of the merge steps under the separation property rather than any self-referential definition, fitted parameter renamed as prediction, or load-bearing self-citation. The lower bounds are likewise derived from explicit constructions without reduction to the positive claims. The derivation chain is self-contained against the stated assumptions and does not invoke uniqueness theorems or ansatzes from prior author work in a circular manner.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The positive results rest on the domain assumption of well-separated (and balanced) optimal clusters plus standard mathematical properties of Euclidean k-means cost; no free parameters, invented entities, or ad-hoc axioms are introduced beyond these.

axioms (2)

standard math k-means cost is defined via squared Euclidean distances to cluster centers
Invoked throughout as the objective; standard definition in the field.
domain assumption separation and balance are properties of the optimal clustering that can be assumed for the analysis
Explicitly stated as the conditions under which the 2-approx and recovery hold.

pith-pipeline@v0.9.0 · 5853 in / 1347 out tokens · 21859 ms · 2026-05-24T23:14:58.975551+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · 3 internal anchors

[1]

Ackermann, Johannes Blömer, Daniel Kuntze, and Christian Sohler

Marcel R. Ackermann, Johannes Blömer, Daniel Kuntze, and Christian Sohler. Analysis of agglomerative clustering. Algorithmica, 69(1):184–215, 2014

work page 2014
[2]

Better guarantees for k-means and euclidean k-median by primal-dual algorithms

Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and J ustin Ward. Better guarantees for k-means and euclidean k-median by primal-dual algorithms. In Proceedings of the 58th IEEE Annual Symposium on Foundations of Computer Science (FOCS) , pages 61–72, 2017

work page 2017
[3]

NP-hardness of Euclidean sum-of-squares clustering

Daniel Aloise, Amit Deshpande, Pierre Hansen, and Preya s Popat. NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75(2):245–248, 2009

work page 2009
[4]

k-means has polynomial smoothed complexity

David Arthur, Bodo Manthey, and Heiko Röglin. k-means has polynomial smoothed complexity. In Proceedings of the 50th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), pages 405–414, 2009

work page 2009
[5]

How slow is the k-means method? In Proceedings of the 22nd International Symposium on Computational Geometry (S oCG), pages 144–153, 2006

David Arthur and Sergei Vassilvitskii. How slow is the k-means method? In Proceedings of the 22nd International Symposium on Computational Geometry (S oCG), pages 144–153, 2006

work page 2006
[6]

k-means++: the a dvantages of careful seeding

David Arthur and Sergei Vassilvitskii. k-means++: the a dvantages of careful seeding. In Proceedings of the 18th ACM-SIAM Symposium on Discrete Algo rithms (SODA) , pages 1027– 1035, 2007

work page 2007
[7]

Worst-case and s moothed analysis of the ICP algorithm, with an application to the k-means method

David Arthur and Sergei Vassilvitskii. Worst-case and s moothed analysis of the ICP algorithm, with an application to the k-means method. SIAM Journal on Computing , 39(2):766–782, 2009

work page 2009
[8]

Center-based clustering under perturbation stability

Pranjal Awasthi, A vrim Blum, and Or Sheﬀet. Center-based clustering under perturbation stability. Information Processing Letters, 112(1-2):49–54, 2012

work page 2012
[9]

The hardness of approximation of euclidean k-means

Pranjal Awasthi, Moses Charikar, Ravishankar Krishnas wamy, and Ali Kemal Sinop. The hardness of approximation of euclidean k-means. In Proceedings of the 31st International Symposium on Computational Geometry (SoCG) , pages 754–767, 2015. 24

work page 2015
[10]

Improved spectral-norm bounds for clustering

Pranjal Awasthi and Or Sheﬀet. Improved spectral-norm bounds for clustering. In Proceedings of the 15th APPROX and 16th RANDOM , pages 37–49, 2012

work page 2012
[11]

A discriminative framework for clustering via similarity functions

Maria-Florina Balcan, A vrim Blum, and Santosh Vempala. A discriminative framework for clustering via similarity functions. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC) , pages 671–680, 2008

work page 2008
[12]

Clustering unde r perturbation resilience

Maria-Florina Balcan and Yingyu Liang. Clustering unde r perturbation resilience. SIAM Journal on Computing , 45(1):102–155, 2016

work page 2016
[13]

R obust hierarchical clustering

Maria-Florina Balcan, Yingyu Liang, and Pramod Gupta. R obust hierarchical clustering. Journal of Machine Learning Research , 15(1):3831–3871, 2014. Appendix C, page 4048

work page 2014
[14]

Computational Feasibility of Clustering under Clusterability Assumptions

Shai Ben-David. Computational feasibility of clusteri ng under clusterability assumptions. CoRR, abs/1501.00437, 2015. URL: http://arxiv.org/abs/1501.00437, arXiv:1501.00437

work page internal anchor Pith review Pith/arXiv arXiv 2015
[15]

Clustering in the pre sence of background noise

Shai Ben-David and Nika Haghtalab. Clustering in the pre sence of background noise. In Proceedings of the 31th International Conference on Machin e Learn- ing, ICML 2014, Beijing, China, 21-26 June 2014 , pages 280–288, 2014. URL: http://jmlr.org/proceedings/papers/v32/ben-david14.html

work page 2014
[16]

Approximate hierarchical clustering via sparsest cut and spreading metrics

Moses Charikar and Vaggos Chatziafratis. Approximate hierarchical clustering via sparsest cut and spreading metrics. In Proceedings of the 28h Annual ACM-SIAM Symposium on Discret e Algorithms (SODA) , pages 841–854, 2017

work page 2017
[17]

Hier- archical clustering: Objective functions and algorithms

Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann -Trenn, and Claire Mathieu. Hier- archical clustering: Objective functions and algorithms. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages 378–397, 2018

work page 2018
[18]

Amit Daniely, Nati Linial, and Michael E. Saks. Cluster ing is diﬃcult only when it does not matter. CoRR, abs/1205.4891, 2012. URL: http://arxiv.org/abs/1205.4891, arXiv:1205.4891

work page internal anchor Pith review Pith/arXiv arXiv 2012
[19]

How fast is k-means? In Proceedings of the 16th Annual Conference on Learning Theory, page 735, 2003

Sanjoy Dasgupta. How fast is k-means? In Proceedings of the 16th Annual Conference on Learning Theory, page 735, 2003

work page 2003
[20]

A cost function for similarity-based hierarchical clustering

Sanjoy Dasgupta. A cost function for similarity-based hierarchical clustering. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing ( STOC), pages 118–127, 2016

work page 2016
[21]

Sanjoy Dasgupta and Philip M. Long. Performance guaran tees for hierarchical clus- tering. Journal of Computer and System Sciences , 70(4):555–569, 2005. URL: http://dx.doi.org/10.1016/j.jcss.2004.10.006, doi:10.1016/j.jcss.2004.10.006

work page doi:10.1016/j.jcss.2004.10.006 2005
[22]

Gonzalés

Teoﬁlo F. Gonzalés. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985

work page 1985
[23]

Improved analysis of c omplete-linkage clustering

Anna Großwendt and Heiko Röglin. Improved analysis of c omplete-linkage clustering. Algo- rithmica, 78(4):1131–1150, 2017

work page 2017
[24]

Clustering with spect ral norm and the k-means al- gorithm

Amit Kumar and Ravindran Kannan. Clustering with spect ral norm and the k-means al- gorithm. In Proceedings of the 51th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), pages 299–308, 2010. 25

work page 2010
[25]

Fin ding meaningful cluster structure amidst background noise

Shrinu Kushagra, Samira Samadi, and Shai Ben-David. Fin ding meaningful cluster structure amidst background noise. In 27th International Conference on Algorithmic Learning The ory (ALT), pages 339–354, 2016

work page 2016
[26]

Mirrokni, and Ilya P

Silvio Lattanzi, Stefano Leonardi, Vahab S. Mirrokni, and Ilya P. Razenshteyn. Robust hierar- chical k-center clustering. In Proceedings of the 2015 Conference on Innovations in Theore tical Computer Science (ITCS) , pages 211–218, 2015

work page 2015
[27]

Improv ed and simpliﬁed inapproximability for k-means

Euiwoong Lee, Melanie Schmidt, and John Wright. Improv ed and simpliﬁed inapproximability for k-means. Information Processing Letters, 120:40–43, 2017

work page 2017
[28]

Williamson

Guolong Lin, Chandrashekhar Nagarajan, Rajmohan Raja raman, and David P. Williamson. A general approach for incremental approximation and hierar chical clustering. SIAM Journal on Computing, 39(8):3633–3669, 2010

work page 2010
[29]

Stuart P. Lloyd. Least squares quantization in PCM. Bell Laboratories Technical Memorandum,

work page
[30]

later published as [30]

work page
[31]

Stuart P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982

work page 1982
[32]

Var adarajan

Meena Mahajan, Prajakta Nimbhorkar, and Kasturi R. Var adarajan. The Planar k-means Problem is NP-Hard. In Proceedings of the 3rd Workshop on Algorithms and Computati on (W ALCOM), pages 274–285, 2009

work page 2009
[33]

Metric Perturbation Resilience

Konstantin Makarychev and Yury Makarychev. Metric per turbation resilience. CoRR, abs/1607.06442, 2016. URL: http://arxiv.org/abs/1607.06442, arXiv:1607.06442

work page internal anchor Pith review Pith/arXiv arXiv 2016
[34]

Improved smoothed analys is of the k-means method

Bodo Manthey and Heiko Röglin. Improved smoothed analys is of the k-means method. In Proceedings of the 20th ACM-SIAM Symposium on Discrete Algo rithms (SODA) , pages 461– 470, 2009

work page 2009
[35]

Mettu and C

Ramgopal R. Mettu and C. Greg Plaxton. The online median problem. SIAM Journal on Computing, 32(3):816–832, 2003

work page 2003
[36]

Schulman, a nd Chaitanya Swamy

Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, a nd Chaitanya Swamy. The eﬀectiveness of Lloyd-type methods for the k-means problem. Journal of the ACM , 59(6):28:1–28:22, 2012

work page 2012
[37]

Greg Plaxton

C. Greg Plaxton. Approximation algorithms for hierarc hical location problems. Journal of Computer and System Sciences , 72(3):425–443, 2006

work page 2006
[38]

Estimating the number of clusters in a dataset via the gap statistic

Robert Tibshirani, Guenther Walther, and Trevor Hasti e. Estimating the number of clusters in a dataset via the gap statistic. Journal of the Royal Statistical Society: Series B (Statist ical Methodology), 63:411–423, 2001

work page 2001
[39]

k-means requires exponentially many iterations even in the p lane

Andrea Vattani. k-means requires exponentially many iterations even in the p lane. Discrete & Computational Geometry, 45(4):596–616, 2011

work page 2011
[40]

Joe H. Ward Jr. Hierarchical grouping to optimize an obj ective function. Journal of the American Statistical Association , 58:236–244, 1963. 26

work page 1963

[1] [1]

Ackermann, Johannes Blömer, Daniel Kuntze, and Christian Sohler

Marcel R. Ackermann, Johannes Blömer, Daniel Kuntze, and Christian Sohler. Analysis of agglomerative clustering. Algorithmica, 69(1):184–215, 2014

work page 2014

[2] [2]

Better guarantees for k-means and euclidean k-median by primal-dual algorithms

Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and J ustin Ward. Better guarantees for k-means and euclidean k-median by primal-dual algorithms. In Proceedings of the 58th IEEE Annual Symposium on Foundations of Computer Science (FOCS) , pages 61–72, 2017

work page 2017

[3] [3]

NP-hardness of Euclidean sum-of-squares clustering

Daniel Aloise, Amit Deshpande, Pierre Hansen, and Preya s Popat. NP-hardness of Euclidean sum-of-squares clustering. Machine Learning, 75(2):245–248, 2009

work page 2009

[4] [4]

k-means has polynomial smoothed complexity

David Arthur, Bodo Manthey, and Heiko Röglin. k-means has polynomial smoothed complexity. In Proceedings of the 50th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), pages 405–414, 2009

work page 2009

[5] [5]

How slow is the k-means method? In Proceedings of the 22nd International Symposium on Computational Geometry (S oCG), pages 144–153, 2006

David Arthur and Sergei Vassilvitskii. How slow is the k-means method? In Proceedings of the 22nd International Symposium on Computational Geometry (S oCG), pages 144–153, 2006

work page 2006

[6] [6]

k-means++: the a dvantages of careful seeding

David Arthur and Sergei Vassilvitskii. k-means++: the a dvantages of careful seeding. In Proceedings of the 18th ACM-SIAM Symposium on Discrete Algo rithms (SODA) , pages 1027– 1035, 2007

work page 2007

[7] [7]

Worst-case and s moothed analysis of the ICP algorithm, with an application to the k-means method

David Arthur and Sergei Vassilvitskii. Worst-case and s moothed analysis of the ICP algorithm, with an application to the k-means method. SIAM Journal on Computing , 39(2):766–782, 2009

work page 2009

[8] [8]

Center-based clustering under perturbation stability

Pranjal Awasthi, A vrim Blum, and Or Sheﬀet. Center-based clustering under perturbation stability. Information Processing Letters, 112(1-2):49–54, 2012

work page 2012

[9] [9]

The hardness of approximation of euclidean k-means

Pranjal Awasthi, Moses Charikar, Ravishankar Krishnas wamy, and Ali Kemal Sinop. The hardness of approximation of euclidean k-means. In Proceedings of the 31st International Symposium on Computational Geometry (SoCG) , pages 754–767, 2015. 24

work page 2015

[10] [10]

Improved spectral-norm bounds for clustering

Pranjal Awasthi and Or Sheﬀet. Improved spectral-norm bounds for clustering. In Proceedings of the 15th APPROX and 16th RANDOM , pages 37–49, 2012

work page 2012

[11] [11]

A discriminative framework for clustering via similarity functions

Maria-Florina Balcan, A vrim Blum, and Santosh Vempala. A discriminative framework for clustering via similarity functions. In Proceedings of the 40th Annual ACM Symposium on Theory of Computing (STOC) , pages 671–680, 2008

work page 2008

[12] [12]

Clustering unde r perturbation resilience

Maria-Florina Balcan and Yingyu Liang. Clustering unde r perturbation resilience. SIAM Journal on Computing , 45(1):102–155, 2016

work page 2016

[13] [13]

R obust hierarchical clustering

Maria-Florina Balcan, Yingyu Liang, and Pramod Gupta. R obust hierarchical clustering. Journal of Machine Learning Research , 15(1):3831–3871, 2014. Appendix C, page 4048

work page 2014

[14] [14]

Computational Feasibility of Clustering under Clusterability Assumptions

Shai Ben-David. Computational feasibility of clusteri ng under clusterability assumptions. CoRR, abs/1501.00437, 2015. URL: http://arxiv.org/abs/1501.00437, arXiv:1501.00437

work page internal anchor Pith review Pith/arXiv arXiv 2015

[15] [15]

Clustering in the pre sence of background noise

Shai Ben-David and Nika Haghtalab. Clustering in the pre sence of background noise. In Proceedings of the 31th International Conference on Machin e Learn- ing, ICML 2014, Beijing, China, 21-26 June 2014 , pages 280–288, 2014. URL: http://jmlr.org/proceedings/papers/v32/ben-david14.html

work page 2014

[16] [16]

Approximate hierarchical clustering via sparsest cut and spreading metrics

Moses Charikar and Vaggos Chatziafratis. Approximate hierarchical clustering via sparsest cut and spreading metrics. In Proceedings of the 28h Annual ACM-SIAM Symposium on Discret e Algorithms (SODA) , pages 841–854, 2017

work page 2017

[17] [17]

Hier- archical clustering: Objective functions and algorithms

Vincent Cohen-Addad, Varun Kanade, Frederik Mallmann -Trenn, and Claire Mathieu. Hier- archical clustering: Objective functions and algorithms. In Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) , pages 378–397, 2018

work page 2018

[18] [18]

Amit Daniely, Nati Linial, and Michael E. Saks. Cluster ing is diﬃcult only when it does not matter. CoRR, abs/1205.4891, 2012. URL: http://arxiv.org/abs/1205.4891, arXiv:1205.4891

work page internal anchor Pith review Pith/arXiv arXiv 2012

[19] [19]

How fast is k-means? In Proceedings of the 16th Annual Conference on Learning Theory, page 735, 2003

Sanjoy Dasgupta. How fast is k-means? In Proceedings of the 16th Annual Conference on Learning Theory, page 735, 2003

work page 2003

[20] [20]

A cost function for similarity-based hierarchical clustering

Sanjoy Dasgupta. A cost function for similarity-based hierarchical clustering. In Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing ( STOC), pages 118–127, 2016

work page 2016

[21] [21]

Sanjoy Dasgupta and Philip M. Long. Performance guaran tees for hierarchical clus- tering. Journal of Computer and System Sciences , 70(4):555–569, 2005. URL: http://dx.doi.org/10.1016/j.jcss.2004.10.006, doi:10.1016/j.jcss.2004.10.006

work page doi:10.1016/j.jcss.2004.10.006 2005

[22] [22]

Gonzalés

Teoﬁlo F. Gonzalés. Clustering to minimize the maximum intercluster distance. Theoretical Computer Science, 38:293–306, 1985

work page 1985

[23] [23]

Improved analysis of c omplete-linkage clustering

Anna Großwendt and Heiko Röglin. Improved analysis of c omplete-linkage clustering. Algo- rithmica, 78(4):1131–1150, 2017

work page 2017

[24] [24]

Clustering with spect ral norm and the k-means al- gorithm

Amit Kumar and Ravindran Kannan. Clustering with spect ral norm and the k-means al- gorithm. In Proceedings of the 51th Annual IEEE Symposium on Foundation s of Computer Science (FOCS), pages 299–308, 2010. 25

work page 2010

[25] [25]

Fin ding meaningful cluster structure amidst background noise

Shrinu Kushagra, Samira Samadi, and Shai Ben-David. Fin ding meaningful cluster structure amidst background noise. In 27th International Conference on Algorithmic Learning The ory (ALT), pages 339–354, 2016

work page 2016

[26] [26]

Mirrokni, and Ilya P

Silvio Lattanzi, Stefano Leonardi, Vahab S. Mirrokni, and Ilya P. Razenshteyn. Robust hierar- chical k-center clustering. In Proceedings of the 2015 Conference on Innovations in Theore tical Computer Science (ITCS) , pages 211–218, 2015

work page 2015

[27] [27]

Improv ed and simpliﬁed inapproximability for k-means

Euiwoong Lee, Melanie Schmidt, and John Wright. Improv ed and simpliﬁed inapproximability for k-means. Information Processing Letters, 120:40–43, 2017

work page 2017

[28] [28]

Williamson

Guolong Lin, Chandrashekhar Nagarajan, Rajmohan Raja raman, and David P. Williamson. A general approach for incremental approximation and hierar chical clustering. SIAM Journal on Computing, 39(8):3633–3669, 2010

work page 2010

[29] [29]

Stuart P. Lloyd. Least squares quantization in PCM. Bell Laboratories Technical Memorandum,

work page

[30] [30]

later published as [30]

work page

[31] [31]

Stuart P. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982

work page 1982

[32] [32]

Var adarajan

Meena Mahajan, Prajakta Nimbhorkar, and Kasturi R. Var adarajan. The Planar k-means Problem is NP-Hard. In Proceedings of the 3rd Workshop on Algorithms and Computati on (W ALCOM), pages 274–285, 2009

work page 2009

[33] [33]

Metric Perturbation Resilience

Konstantin Makarychev and Yury Makarychev. Metric per turbation resilience. CoRR, abs/1607.06442, 2016. URL: http://arxiv.org/abs/1607.06442, arXiv:1607.06442

work page internal anchor Pith review Pith/arXiv arXiv 2016

[34] [34]

Improved smoothed analys is of the k-means method

Bodo Manthey and Heiko Röglin. Improved smoothed analys is of the k-means method. In Proceedings of the 20th ACM-SIAM Symposium on Discrete Algo rithms (SODA) , pages 461– 470, 2009

work page 2009

[35] [35]

Mettu and C

Ramgopal R. Mettu and C. Greg Plaxton. The online median problem. SIAM Journal on Computing, 32(3):816–832, 2003

work page 2003

[36] [36]

Schulman, a nd Chaitanya Swamy

Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, a nd Chaitanya Swamy. The eﬀectiveness of Lloyd-type methods for the k-means problem. Journal of the ACM , 59(6):28:1–28:22, 2012

work page 2012

[37] [37]

Greg Plaxton

C. Greg Plaxton. Approximation algorithms for hierarc hical location problems. Journal of Computer and System Sciences , 72(3):425–443, 2006

work page 2006

[38] [38]

Estimating the number of clusters in a dataset via the gap statistic

Robert Tibshirani, Guenther Walther, and Trevor Hasti e. Estimating the number of clusters in a dataset via the gap statistic. Journal of the Royal Statistical Society: Series B (Statist ical Methodology), 63:411–423, 2001

work page 2001

[39] [39]

k-means requires exponentially many iterations even in the p lane

Andrea Vattani. k-means requires exponentially many iterations even in the p lane. Discrete & Computational Geometry, 45(4):596–616, 2011

work page 2011

[40] [40]

Joe H. Ward Jr. Hierarchical grouping to optimize an obj ective function. Journal of the American Statistical Association , 58:236–244, 1963. 26

work page 1963