On the power of linear programming for K-means clustering
Pith reviewed 2026-05-24 04:13 UTC · model grok-4.3
The pith
A linear programming relaxation for K-means clustering returns exact solutions on nearly all tested instances.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The LP relaxation introduced in prior work is tight for K-means clustering under sufficient conditions when K equals two, recovers the correct clusters under a popular stochastic model, and is observed to be tight on the great majority of both synthetic and real-world data sets; the same relaxation can be solved at practical scale by a cutting-plane algorithm that uses the PDLP solver.
What carries the argument
The linear-programming relaxation containing Omega(n cubed) inequalities, whose optimal solution is feasible for the original K-means problem under the derived conditions.
If this is right
- When the input meets the two-cluster sufficient conditions the LP solves the K-means problem exactly.
- Under the stochastic model the LP recovers the planted clusters with high probability.
- A cutting-plane algorithm with the PDLP solver computes the relaxation for up to four thousand points.
- There exist explicit families of inputs for which the relaxation is never tight.
Where Pith is reading between the lines
- If similar tightness patterns appear for K greater than two, exact K-means could become practical without combinatorial search.
- The counterexample family indicates that any general tightness proof must exclude certain geometric configurations of the point set.
- The same relaxation technique could be tested on related partitioning problems such as balanced graph cuts.
Load-bearing premise
The data points satisfy the sufficient conditions that make the LP solution feasible for the integer K-means formulation.
What would settle it
A data set of moderate size on which the LP optimum violates at least one of the original K-means assignment constraints.
Figures
read the original abstract
In [SIAM J. Optim., 2022], the authors introduced a new linear programming (LP) relaxation for K-means clustering. In this paper, we further investigate both theoretical and computational properties of this relaxation. As evident from our numerical experiments with both synthetic real-world data sets, the proposed LP relaxation is almost always tight; i.e. its optimal solution is feasible for the original nonconvex problem. To better understand this unexpected behaviour, on the theoretical side, we focus on K-means clustering with two clusters, and we obtain sufficient conditions under which the LP relaxation is tight. We further analyze the sufficient conditions when the input is generated according to a popular stochastic model and obtain recovery guarantees for the LP relaxation. We conclude our theoretical study by constructing a family of inputs for which the LP relaxation is never tight. Denoting by $n$ the number of data points to be clustered, the LP relaxation contains $\Omega(n^3)$ inequalities making it impractical for large data sets. To address the scalability issue, by building upon a cutting-plane algorithm together with the GPU implementation of PDLP, a first-order method LP solver, we develop an efficient algorithm that solves the proposed LP and hence the K-means clustering problem, for up to $n \leq 4000$ data points.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript examines the LP relaxation for K-means clustering introduced in the 2022 SIAM J. Optim. paper. For K=2 it derives sufficient conditions guaranteeing tightness of the relaxation, obtains recovery guarantees when data follows a standard stochastic model, and constructs an explicit family of instances where the relaxation is never tight. It further develops a cutting-plane algorithm paired with a GPU implementation of the PDLP first-order solver to handle the Ω(n³) inequality formulation up to n=4000. Numerical experiments on synthetic and real-world data are presented to support the claim that the relaxation is almost always tight on the tested instances.
Significance. If the reported empirical tightness generalizes, the work shows that an LP relaxation can recover exact K-means solutions for moderate n, backed by explicit sufficient conditions and recovery results for the two-cluster case. The scalable solver directly addresses the cubic-size barrier identified in the formulation, and the construction of counterexamples clarifies the limits of the approach. These elements together provide both theoretical insight and a practical computational tool for the K-means problem.
major comments (1)
- [Numerical experiments / abstract] Numerical experiments (as summarized in the abstract and described in the experimental section): the central claim that the LP relaxation is 'almost always tight' is supported solely by experiments on unspecified synthetic and real-world data sets up to n=4000. No details are given on the precise collection of instances, exclusion rules, number of random trials, or statistical measures such as error bars, which makes it difficult to assess the robustness of the 'almost always' assertion that underpins the practical significance of the relaxation.
minor comments (2)
- [Abstract] Abstract: the phrase 'synthetic real-world data sets' is missing the conjunction 'and'.
- [Scalability / algorithmic section] The manuscript correctly identifies the Ω(n³) inequality count as the scalability bottleneck and addresses it via cutting planes; however, the description of the cutting-plane implementation could benefit from a brief pseudocode outline or reference to the specific separation routine used.
Simulated Author's Rebuttal
We thank the referee for the detailed review and constructive suggestion regarding the experimental section. We agree that additional transparency will strengthen the manuscript and will revise accordingly.
read point-by-point responses
-
Referee: [Numerical experiments / abstract] Numerical experiments (as summarized in the abstract and described in the experimental section): the central claim that the LP relaxation is 'almost always tight' is supported solely by experiments on unspecified synthetic and real-world data sets up to n=4000. No details are given on the precise collection of instances, exclusion rules, number of random trials, or statistical measures such as error bars, which makes it difficult to assess the robustness of the 'almost always' assertion that underpins the practical significance of the relaxation.
Authors: We agree with the referee that the current description of the numerical experiments lacks sufficient detail. In the revised manuscript we will expand Section 5 (and the abstract if needed) to specify: (i) the exact generation procedure and parameter ranges for all synthetic instances, (ii) the sources and preprocessing steps for the real-world data sets, (iii) any exclusion or filtering rules applied, (iv) the total number of random trials performed for each experiment, and (v) statistical measures (means, standard deviations, or error bars) for the reported tightness frequencies. These additions will allow readers to evaluate the robustness of the empirical claim. revision: yes
Circularity Check
Minor self-citation to 2022 LP formulation; all new results derived independently
full rationale
The paper cites its own prior SIAM J. Optim. 2022 work solely to define the LP relaxation being analyzed. All load-bearing contributions—sufficient conditions for tightness when K=2, recovery guarantees under the stochastic model, explicit counterexample family, and the empirical 'almost always tight' observation—are obtained from first-principles derivations or direct numerical experiments on the given instances. No new prediction reduces to a fitted parameter, self-defined quantity, or unverified self-citation chain; the central claims remain externally falsifiable via the stated conditions and tested data sets up to n=4000.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The LP relaxation polytope contains the convex hull of feasible K-means assignments.
Forward citations
Cited by 2 Pith papers
-
Accessible Complexity Bounds for Restarted PDHG on Linear Programs with a Unique Optimizer
Derives accessible O(κΦ ln(κΦ ||w*||/ε)) iteration bound for rPDHG on unique-optima LPs, with computable Φ, two-stage performance, and equivalence to stability and sharpness.
-
From Sequential Nodes to GPU Batches: Parallel Branch and Bound for Optimal $k$-Sparse GLMs
A modular CPU-GPU batching framework for branch-and-bound delivers 10-100x speedups with zero optimality gap when certifying optimal cardinality-constrained GLMs.
Reference graph
Works this paper leans on
-
[1]
http://docs.mosek.com/9.0/faq.pdf
MOSEK 9.2 , 2019. http://docs.mosek.com/9.0/faq.pdf
work page 2019
-
[2]
P. Abdalla and A. S. Bandeira. Community detection with a subsampled semidefinite program. Sampling Theory, Signal Processing, and Data Analysis , 20(1):6, 2022
work page 2022
- [3]
-
[4]
D. Applegate, M. D ´ ıaz, O. Hinder, H. Lu, M. Lubin, B. O’Do noghue, and W. Schudy. Practical large-scale linear programming using primal-dual hybrid gradient. Advances in Neural Information Processing Systems , 34:20243–20257, 2021
work page 2021
-
[5]
P. Awasthi, A. S. Bandeira, M. Charikar, R. Krishnaswamy , S. Villar, and R. Ward. Relax, no need to round: Integrality of clustering formulations. Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, 165:191–200, 2015
work page 2015
- [6]
-
[7]
L. Dagum and R. Menon. OpenMP: an industry standard api fo r shared-memory programming. IEEE compu- tational science and engineering , 5(1):46–55, 1998
work page 1998
-
[8]
A. De Rosa and A. Khajavirad. The ratio-cut polytope and K -means clustering. SIAM Journal on Optimization , 32(1):173–203, 2022
work page 2022
-
[9]
D. Dua, C. Graff, et al. UCI machine learning repository. URL http://archive. ics. uci. edu/ml , 2017
work page 2017
-
[10]
Z. Friggstad, M. Rezapour, and Salavatipour M. R. Local search yields a PTAS for K-means in doubling metrics. SIAM Journal on Computing , 48(2):452–480, 2019
work page 2019
-
[11]
Gurobi Optimizer Reference Manual, 2021
Gurobi Optimization, LLC. Gurobi Optimizer Reference Manual, 2021. URL: https://www.gurobi.com
work page 2021
- [12]
-
[13]
T. Kanungo, D.M. Mount, N.S. Netanyahu, C.D. Piatko, R. Silverman, and A.Y. Wu. A local search approxi- mation algorithm for K-means clustering. Proceedings of the 18th Annual ACM Symposium on Computation al Geometry, pages 10–18, 2002
work page 2002
-
[14]
A. Khajavirad and N. V. Sahinidis. A hybrid LP/NLP parad igm for global optimization relaxations. Mathe- matical Programming Computation , 10:383–421, 2018. doi:10.1007/s12532-018-0138-5
-
[15]
X. Li, Y. Li, S. Ling, T. Strohmer, and K. Wei. When do bird s of a feather flock together? K-means, proximity, and conic programming. Mathematical Programming, 179:295–341, 2020
work page 2020
-
[16]
S. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory , 28(2):129 –137, 1982
work page 1982
- [17]
- [18]
-
[19]
M. Mahajan, P. Nimbhorkar, and K. Varadarajan. The plan ar K-means problem is NP-hard. In WALCOM: Algorithms and Computation , pages 274–285. Springer Berlin Heidelberg, 2009
work page 2009
-
[20]
O.L. Mangasarian. Uniqueness of solution in linear pro gramming. Linear Algebra and its Applications , 25:151– 162, 1979
work page 1979
- [21]
-
[22]
D. G. Mixon, S. Villar, and R. Ward. Clustering subgauss ian mixtures by semidefinite programming. Information and Inference: A Journal of the IMA , 6(4):389–415, 2017
work page 2017
-
[23]
D. G. Mixon and K. Xie. Sketching semidefinite programs f or faster clustering. IEEE Transactions on Informa- tion Theory , 67(10):6832–6840, 2021
work page 2021
-
[24]
A. Neumaier and O. Shcherbina. Safe bounds in linear and mixed-integer linear programming. Mathematical Programming, 99:283–296, 2004. 35
work page 2004
-
[25]
Approximating K-means-type clusterin g via semidefinite programming
J Peng and Y Wei. Approximating K-means-type clusterin g via semidefinite programming. SIAM Journal on Optimization, 18(1):186–205, 2007
work page 2007
-
[26]
J. Peng and Y. Xia. A New Theoretical Framework for K-Means-Type Clustering , pages 79–96. Springer Berlin Heidelberg, 2005
work page 2005
-
[27]
V. Piccialli, A. M. Sudoso, and A. Wiegele. SOS-SDP: An e xact solver for minimum sum-of-squares clustering. INFORMS Journal on Computing , 34(4):2144–2162, 2022
work page 2022
-
[28]
M. N. Prasad and G. A. Hanasusanto. Improved conic refor mulations for K-means clustering. SIAM Journal on Optimization , 28(4):3105–3126, 2018
work page 2018
- [29]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.