k-Clustering via Iterative Randomized Rounding
Pith reviewed 2026-05-10 18:16 UTC · model grok-4.3
The pith
A single iterative randomized rounding procedure on the standard LP relaxation produces a ((3^p + 1)/2)-LMP approximation for k-clustering with p-th power distance costs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
There exists a single iterative randomized rounding algorithm that, given any fractional solution to the standard LP relaxation of k-clustering, outputs a random integral solution opening k facilities in expectation whose expected cost is at most (3^p + 1)/2 times the LP value (hence at most (3^p + 1)/2 OPT) for p-th power distance costs; this LMP-preserving rounding can be turned into a true ((3^p + 1)/2 + ε)-approximation by standard techniques.
What carries the argument
Iterative randomized rounding of the fractional facility-opening variables that preserves the Lagrangian-multiplier property while respecting the cardinality constraint.
If this is right
- Recovers the best-known (2 + ε) approximation for metric k-median.
- Improves the approximation factor for general-metric k-means from 5.83 to 5 + ε.
- Achieves a (4 + ε) approximation for Euclidean k-means via a specialized analysis.
- Unifies the treatment of all p-power costs under one rounding procedure.
Where Pith is reading between the lines
- The separation between the LMP rounding step and the final (1 + ε) conversion may allow similar roundings to be plugged into other Lagrangian-based frameworks for clustering variants.
- Because the rounding works from any fractional solution, it could be combined with faster LP solvers or sampling-based methods that produce approximate fractional solutions.
- The improvement for Euclidean k-means suggests that geometry-specific analyses can tighten the general (3^p + 1)/2 bound without changing the rounding procedure itself.
Load-bearing premise
The input distances obey the triangle inequality (or Euclidean properties) so that the cost of the rounded solution remains bounded relative to the fractional LP cost.
What would settle it
An explicit instance together with a feasible fractional LP solution whose rounded integral solutions have expected p-power cost strictly larger than (3^p + 1)/2 times the LP value.
read the original abstract
In this work we propose a single rounding algorithm for the fractional solutions of the standard LP relaxation for $k$-clustering. As a starting point, we obtain an iterative rounding $(\frac{3^p + 1}{2})$-Lagrangian Multiplier-Perserving (LMP) approximation for the $k$-clustering problem with the cost function being the $p$-th power of the distance. Such an algorithm outputs a random solution that opens $k$ facilities \emph{in expectation}, whose cost in expectation is at most $\frac{3^p + 1}{2}$ times the optimum cost. Thus, we recover the $2$-LMP approximation for $k$-median by Jain et al.~[JACM'03], which played a central role in deriving the current best $2$ approximation for $k$-median. Unlike the result of Jain et al., our algorithm is based on LP rounding, and it can be easily adapted to the $L_p^p$-cost setting. For the Euclidean $k$-means problem, the LMP factor we obtain is $\frac{11}{3}$, which is better than the $5$ approximation given by this framework for general metrics. Then, we show how to convert the LMP-approximation algorithms to a true-approximation, with only a $(1+\varepsilon)$ factor loss in the approximation ratio. We obtain a ($\frac{3^p + 1}{2}+\varepsilon$)-approximation algorithm for $k$-clustering with cost function being the $p$-th power of the distance, for $p \geq 1$. This reproduces the best known ($2+\varepsilon$)-approximation for $k$-median by Cohen-Addad et al. [STOC'25], and improves the approximation factor for metric $k$-means from 5.83 by Charikar at al. [FOCS'25] to $5+\varepsilon$ in our framework. Moreover, the same algorithm, but with a specialized analysis, attains ($4+\varepsilon$)-approximation for Euclidean $k$-means matching the recent result by Charikar et al. [STOC'26].
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a single iterative randomized rounding algorithm applied to the standard LP relaxation for k-clustering with p-th power distance costs. It yields a ((3^p + 1)/2)-LMP approximation that opens exactly k facilities in expectation, recovering the 2-LMP bound for k-median (p=1) and giving 11/3 for Euclidean k-means. The LMP guarantee is converted via Lagrangian multiplier search to a true ((3^p + 1)/2 + ε)-approximation, improving the metric k-means factor to 5+ε and achieving 4+ε for Euclidean k-means with a specialized analysis.
Significance. If the central claims hold, this provides a unified LP-rounding framework that reproduces the best-known 2+ε approximation for k-median and improves prior bounds for metric k-means while matching recent Euclidean results. The direct derivation of the LMP property from randomized iterative rounding (rather than primal-dual methods) and the clean separation of general-metric and Euclidean analyses are technical strengths that could streamline future work on clustering approximations.
Simulated Author's Rebuttal
We thank the referee for the positive review, the accurate summary of our contributions, and the recommendation to accept. The report lists no major comments.
Circularity Check
No significant circularity
full rationale
The paper derives its ((3^p + 1)/2 + ε)-approximation directly from a new iterative randomized rounding procedure on the standard LP relaxation for k-clustering with p-power costs. The LMP guarantee is obtained by analyzing the rounding's expectation properties and cost bounds, which are constructed and proved within the paper without reducing to fitted inputs or self-definitions. The conversion from LMP to true approximation applies standard Lagrangian multiplier search techniques. Prior citations (e.g., Jain et al. for the p=1 case) are used only for reproduction and context; they are not load-bearing for the novel general-p or Euclidean improvements. The derivation chain is self-contained against external benchmarks and exhibits no circular reductions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The standard LP relaxation for k-clustering is a valid starting point whose fractional solutions can be rounded while preserving the LMP property.
Reference graph
Works this paper leans on
-
[1]
Sara Ahmadian, Ashkan Norouzi-Fard, Ola Svensson, and Justin Ward. Better guarantees for k-means and euclidean k-median by primal-dual algorithms.SIAM Journal on Computing, 49(4):FOCS17–97, 2019
work page 2019
-
[2]
Local search heuristic for k-median and facility location problems
Vijay Arya, Naveen Garg, Rohit Khandekar, Adam Meyerson, Kamesh Munagala, and Vinayaka Pandit. Local search heuristic for k-median and facility location problems. InProceedings of the thirty-third annual ACM symposium on Theory of computing, pages 21–29, 2001
work page 2001
-
[3]
On minimum sum of radii and diameters clustering
Babak Behsaz and Mohammad R Salavatipour. On minimum sum of radii and diameters clustering. Algorithmica, 73(1):143–165, 2015
work page 2015
-
[4]
Jarosław Byrka and Karen Aardal. An optimal bifactor approximation algorithm for the metric unca- pacitated facility location problem.SIAM Journal on Computing, 39(6):2212–2231, 2010
work page 2010
-
[5]
Jarosław Byrka, Thomas Pensyl, Bartosz Rybicki, Aravind Srinivasan, and Khoa Trinh. An improved approximation for k-median and positive correlation in budgeted optimization.ACM Transactions on Algorithms (TALG), 13(2):1–31, 2017. 30
work page 2017
-
[6]
Fault-tolerant facility location: a ran- domized dependent lp-rounding algorithm
Jarosław Byrka, Aravind Srinivasan, and Chaitanya Swamy. Fault-tolerant facility location: a ran- domized dependent lp-rounding algorithm. InInternational Conference on Integer Programming and Combinatorial Optimization, pages 244–257. Springer, 2010
work page 2010
-
[7]
An improved greedy approximation for (metric) k-means
Moses Charikar, Vincent Cohen-Addad, Ruiquan Gao, Fabrizio Grandoni, Euiwoong Le, and Ernest Van Wijland. An improved greedy approximation for (metric) k-means. In2025 IEEE 66th Annual Symposium on Foundations of Computer Science (FOCS), pages 233–240. IEEE, 2025
work page 2025
-
[8]
A (4+ε)-approximation for euclidean k-means via non-monotone dual-fitting
Moses Charikar, Vincent Cohen-Addad, Ruiquan Gao, Fabrizio Grandoni, Euiwoong Lee, and Ernest van Wijland. A (4+ε)-approximation for euclidean k-means via non-monotone dual-fitting. InProceed- ings of the 58th Annual ACM Symposium on Theory of Computing. IEEE, 2026
work page 2026
-
[9]
A constant-factor approximation algorithm for the k-median problem
Moses Charikar, Sudipto Guha, Éva Tardos, and David B Shmoys. A constant-factor approximation algorithm for the k-median problem. InProceedings of the thirty-first annual ACM symposium on Theory of computing, pages 1–10, 1999
work page 1999
-
[10]
Fabián A Chudak and David B Shmoys. Improved approximation algorithms for the uncapacitated facility location problem.SIAM Journal on Computing, 33(1):1–25, 2003
work page 2003
-
[11]
Improved approx- imations for euclidean k-means and k-median, via nested quasi-independent sets
Vincent Cohen-Addad, Hossein Esfandiari, Vahab Mirrokni, and Shyam Narayanan. Improved approx- imations for euclidean k-means and k-median, via nested quasi-independent sets. InProceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing, pages 1621–1628, 2022
work page 2022
-
[12]
A (2+ε)-approximation algorithm for metric k-median
Vincent Cohen-Addad, Fabrizio Grandoni, Euiwoong Lee, Chris Schwiegelshohn, and Ola Svensson. A (2+ε)-approximation algorithm for metric k-median. InProceedings of the 57th Annual ACM Sympo- sium on Theory of Computing, pages 615–624, 2025
work page 2025
-
[13]
Tight fpt approx- imations for k-median and k-means
Vincent Cohen-Addad, Anupam Gupta, Amit Kumar, Euiwoong Lee, and Jason Li. Tight fpt approx- imations for k-median and k-means. In46th International Colloquium on Automata, Languages, and Programming (ICALP 2019), volume 132, pages 42–1. Schloss Dagstuhl–Leibniz-Zentrum fuer Infor- matik, 2019
work page 2019
-
[14]
A ptas for k-means clustering based on weak coresets
Dan Feldman, Morteza Monemizadeh, and Christian Sohler. A ptas for k-means clustering based on weak coresets. InProceedings of the twenty-third annual symposium on Computational geometry, pages 11–18, 2007
work page 2007
-
[15]
Inapproxima- bility of maximum diameter clustering for few clusters
Henry Fleischmann, Kyrylo Karlov, Karthik CS, Ashwin Padaki, and Stepan Zharkov. Inapproxima- bility of maximum diameter clustering for few clusters. InProceedings of the 2025 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 4707–4731. SIAM, 2025
work page 2025
-
[16]
Improved bi-point rounding algorithmsandagoldenbarrierfork-median
Kishen N Gowda, Thomas Pensyl, Aravind Srinivasan, and Khoa Trinh. Improved bi-point rounding algorithmsandagoldenbarrierfork-median. InProceedings of the 2023 Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 987–1011. SIAM, 2023
work page 2023
-
[17]
Sudipto Guha and Samir Khuller. Greedy strikes back: Improved facility location algorithms.Journal of algorithms, 31(1):228–248, 1999
work page 1999
-
[18]
Simpler Analyses of Local Search Algorithms for Facility Location
Anupam Gupta and Kanat Tangwongsan. Simpler analyses of local search algorithms for facility loca- tion.arXiv preprint arXiv:0809.2554, 2008
work page Pith review arXiv 2008
-
[19]
Data clustering: 50 years beyond k-means.Pattern recognition letters, 31(8):651–666, 2010
Anil K Jain. Data clustering: 50 years beyond k-means.Pattern recognition letters, 31(8):651–666, 2010
work page 2010
-
[20]
Kamal Jain, Mohammad Mahdian, Evangelos Markakis, Amin Saberi, and Vijay V Vazirani. Greedy facility location algorithms analyzed using dual fitting with factor-revealing lp.Journal of the ACM (JACM), 50(6):795–824, 2003
work page 2003
-
[21]
Kamal Jain and Vijay V Vazirani. Approximation algorithms for metric facility location and k-median problems using the primal-dual schema and lagrangian relaxation.Journal of the ACM (JACM), 48(2):274–296, 2001. 31
work page 2001
-
[22]
Shi Li. A 1.488 approximation algorithm for the uncapacitated facility location problem.Information and Computation, 222:45–58, 2013
work page 2013
-
[23]
Approximating k-median via pseudo-approximation
Shi Li and Ola Svensson. Approximating k-median via pseudo-approximation. Inproceedings of the forty-fifth annual ACM symposium on theory of computing, pages 901–910, 2013
work page 2013
-
[24]
J MacQueen. Multivariate observations. InProceedings ofthe 5th Berkeley symposium on mathematical statisticsand probability, volume 1, pages 281–297. University of California press Oakland, CA, USA, 1967
work page 1967
-
[25]
Alessandro Panconesi and Aravind Srinivasan. Randomized distributed edge coloring via an extension of the chernoff–hoeffding bounds.SIAM Journal on Computing, 26(2):350–368, 1997
work page 1997
-
[26]
Approximation algorithms for facility location prob- lems
David B Shmoys, Éva Tardos, and Karen Aardal. Approximation algorithms for facility location prob- lems. InProceedings of the twenty-ninth annual ACM symposium on Theory of computing, pages 265– 274, 1997
work page 1997
-
[27]
Distributions on level-sets with applications to approximation algorithms
Aravind Srinivasan. Distributions on level-sets with applications to approximation algorithms. In Proceedings 42nd IEEE Symposium on Foundations of Computer Science, pages 588–597. IEEE, 2001. A Obtaining solutions from additive pseudo-solutions Theorem A.1.For any constantp≥1, suppose thatAis ac-additiveα-approximation algorithm for k-clustering with cos...
work page 2001
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.