On the sequential convergence of Lloyd's algorithms
Pith reviewed 2026-05-24 00:55 UTC · model grok-4.3
The pith
Lloyd's algorithm iterates converge sequentially to one point under an analytic density assumption.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Lloyd's algorithm is interpreted as a gradient method on the quantization functional given by optimal transport. Under the assumption that the target probability measure admits an analytic density or an analytic density restricted to a compact semi-algebraic set, the functional is definable in a log-analytic structure. This ensures the Kurdyka-Lojasiewicz property holds, which in turn yields sequential convergence of the iterates to a single accumulation point for both optimal quantization with arbitrary discrete measures and uniform quantization with uniform discrete measures.
What carries the argument
The log-analytic character of globally subanalytic integrals, which transfers definability to the quantization functional and thereby activates Kurdyka-Lojasiewicz convergence for the gradient interpretation of Lloyd's method.
If this is right
- The iterates reach a single critical point instead of oscillating among several accumulation points.
- The convergence result covers both the case of arbitrary discrete measures and the uniform discrete-measure case.
- Definability holds for broader semi-discrete optimal transport losses, including general-cost transport, max-sliced Wasserstein distance, and entropy-regularized transport.
- The guarantees apply directly to practical settings such as analytic densities truncated to compact semi-algebraic domains.
Where Pith is reading between the lines
- The same definability route might be used to obtain convergence statements for other iterative optimal-transport algorithms beyond Lloyd's method.
- If analyticity can be weakened while preserving log-analytic integrability, sequential convergence could extend to a larger class of densities.
- In applications one could verify the analyticity condition on a given density to decide whether the theoretical guarantee applies.
Load-bearing premise
The target probability measure admits an analytic density, possibly restricted to a compact semi-algebraic set.
What would settle it
An explicit analytic density on a compact semi-algebraic set for which the sequence of Lloyd iterates possesses at least two distinct accumulation points would falsify the sequential-convergence claim.
Figures
read the original abstract
Lloyd's algorithm is an iterative method that solves the quantization problem, i.e. the approximation of a target probability measure by a discrete one, and is particularly used in digital applications. This algorithm can be interpreted as a gradient method on a certain quantization functional which is given by optimal transport. We study the sequential convergence (to a single accumulation point) for two variants of Lloyd's method: (i) optimal quantization with an arbitrary discrete measure and (ii) uniform quantization with a uniform discrete measure. For both cases, we prove sequential convergence of the iterates under an analiticity assumption on the density of the target measure. This includes for example analytic densities truncated to a compact semi-algebraic set. The argument leverages the log analytic nature of globally subanalytic integrals, the interpretation of Lloyd's method as a gradient method and the convergence analysis of gradient algorithms under Kurdyka-Lojasiewicz assumptions. As a by-product, we also obtain definability results for more general semi-discrete optimal transport losses such as transport distances with general costs, the max-sliced Wasserstein distance and the entropy regularized optimal transport loss.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves sequential convergence of the iterates for two variants of Lloyd's algorithm—optimal quantization with arbitrary discrete measures and uniform quantization with uniform discrete measures—under the assumption that the target probability measure has an analytic density (or an analytic density restricted to a compact semi-algebraic set). The argument interprets Lloyd iterates as gradient steps on an optimal-transport quantization functional, invokes the Kurdyka-Łojasiewicz inequality via definability of globally subanalytic integrals, and uses the log-analytic character of those integrals; as a byproduct it establishes definability for several semi-discrete OT losses including max-sliced Wasserstein and entropy-regularized transport.
Significance. If the proofs are correct, the result supplies rigorous sequential-convergence guarantees for a classical algorithm under a broad and practically relevant analyticity hypothesis, while the definability corollaries enlarge the set of OT functionals known to lie in o-minimal structures. These contributions are of clear interest to the quantization, clustering, and optimal-transport communities.
major comments (2)
- [§3] §3 (gradient interpretation) and the subsequent KL analysis: the manuscript invokes external theorems on gradient methods under KL assumptions, but does not explicitly verify that the quantization functional satisfies the required desingularizing function with exponent θ<1 when the density is merely analytic on a semi-algebraic set; a short self-contained check or reference to the precise exponent obtained from the log-analytic integral would remove any ambiguity.
- [Theorem 4.1] Theorem 4.1 (sequential convergence for optimal quantization): the passage from definability of the subanalytic integral to the KL inequality is stated as a direct consequence of the log-analytic character, yet the precise o-minimal structure and the resulting exponent are not recorded; without this datum it is impossible to confirm that the convergence rate implied by the KL property is compatible with the claimed sequential (rather than merely subsequential) convergence.
minor comments (2)
- [Abstract] Abstract: 'analiticity' is a typographical error for 'analyticity'.
- [Theorem 4.2] The statement of the uniform-quantization result (Theorem 4.2) repeats the same definability argument as the optimal case; a one-sentence remark on any structural differences would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful reading, positive evaluation, and constructive suggestions. We address the two major comments below and will incorporate clarifications in a revised version.
read point-by-point responses
-
Referee: [§3] §3 (gradient interpretation) and the subsequent KL analysis: the manuscript invokes external theorems on gradient methods under KL assumptions, but does not explicitly verify that the quantization functional satisfies the required desingularizing function with exponent θ<1 when the density is merely analytic on a semi-algebraic set; a short self-contained check or reference to the precise exponent obtained from the log-analytic integral would remove any ambiguity.
Authors: We agree that an explicit reference to the exponent would remove ambiguity. The log-analytic character of the globally subanalytic integrals (as established in the paper via the analytic density on a compact semi-algebraic set) yields a desingularizing function with exponent θ = 1/2, which is strictly less than 1 and compatible with the cited gradient-method theorems. In the revision we will add a short paragraph in §3 with this self-contained check and a pointer to the relevant property of log-analytic functions in the o-minimal structure of globally subanalytic sets. revision: yes
-
Referee: [Theorem 4.1] Theorem 4.1 (sequential convergence for optimal quantization): the passage from definability of the subanalytic integral to the KL inequality is stated as a direct consequence of the log-analytic character, yet the precise o-minimal structure and the resulting exponent are not recorded; without this datum it is impossible to confirm that the convergence rate implied by the KL property is compatible with the claimed sequential (rather than merely subsequential) convergence.
Authors: The o-minimal structure employed is that of globally subanalytic sets, and the log-analytic character directly supplies the KL inequality with exponent θ = 1/2. This exponent guarantees the sequential convergence claimed in Theorem 4.1 via the standard results on gradient flows under KL assumptions. We will record these two pieces of information explicitly in the statement and proof of Theorem 4.1 in the revised manuscript. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper establishes a conditional theorem: sequential convergence of two Lloyd variants holds when the target measure has an analytic density (or analytic density on a compact semi-algebraic set). The proof chain interprets Lloyd iterates as a gradient method, invokes the Kurdyka-Łojasiewicz inequality via definability of globally subanalytic integrals, and applies known convergence results for gradient algorithms under KL assumptions. These are external, independently established facts from real algebraic geometry and optimization theory; the derivation does not reduce any claimed prediction or uniqueness result to a quantity defined from the paper's own fitted parameters, self-citations, or ansatzes. The by-product definability statements for other OT losses are likewise derived from the same external machinery rather than circularly from the convergence claim itself.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The quantization functional satisfies the Kurdyka-Lojasiewicz inequality
- standard math Globally subanalytic integrals are log-analytic
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We prove sequential convergence of the iterates under an analyticity assumption on the density of the target measure... leverages the log analytic nature of globally subanalytic integrals... Kurdyka-Łojasiewicz assumptions.
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanabsolute_floor_iff_bare_distinguishability unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
the log analytic nature of globally subanalytic integrals... definability results for more general semi-discrete optimal transport losses
-
IndisputableMonolith/Foundation/AlexanderDuality.leanalexander_duality_circle_linking unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
Theorem 2.3... Theorem 2.5... KL function on (Rd)^N
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 1 Pith paper
-
Weighted quantization using MMD: From mean field to mean shift via gradient flows
Derives MSIP algorithm from MMD gradient flows for weighted quantization, extending mean shift and relating to preconditioned gradient descent and Lloyd's clustering.
Reference graph
Works this paper leans on
- [1]
-
[2]
An introduction to self-organizing maps
Umut Asan and Secil Ercan. An introduction to self-organizing maps. In Computational intelligence systems in industrial engineering: With recent theory and applications, pages 295–315. Springer, 2012. 23
work page 2012
-
[3]
H. Attouch and J. Bolte. On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program., 116:5–16, 2009
work page 2009
-
[4]
H. Attouch, J. Bolte, P. Redont, and A. Soubeyran. Proximal alternating minimization and projec- tion methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Mathematics of Operations Research, 35(2):438–457, 2010
work page 2010
-
[5]
H. Attouch, J. Bolte, and B.F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Mathematical Programming, 137(1):91–129, 2013
work page 2013
- [6]
- [7]
-
[8]
S. Bobkov and M. Ledoux. One-dimensional empirical measures, order statistics, and Kantorovich transport distances. American Mathematical Society, 261(1259), 2019
work page 2019
- [9]
- [10]
- [11]
- [12]
- [13]
-
[14]
N. Bonneel, J. Rabin, G. Peyr ´e, and H. Pfister. Sliced and Radon Wasserstein barycenters of measures. Journal of Mathematical Imaging and Vision, 1(51):22–45, 2015
work page 2015
-
[15]
L. Bottou and Y . Bengio. Convergence properties of the k-means algorithms. Advances in neural information processing systems, 7, 1994
work page 1994
- [16]
-
[17]
C. Bouton and G. Pag `es. About the multidimensional competitive learning vector quantization algo- rithm with constant gain. The Annals of Applied Probability, pages 679–710, 1997
work page 1997
-
[18]
R. Cluckers and D.J. Miller. Stability under integration of sums of products of real globally subanalytic functions and their logarithms. Duke Mathematical Journal, 156(2):311 – 348, 2011
work page 2011
-
[19]
Comte, J-M Lion, and J-P Rolin
G. Comte, J-M Lion, and J-P Rolin. Nature log-analytique du volume des sous-analytiques. Illinois Journal of Mathematics, 44(4):884–888, 2000. 24
work page 2000
-
[20]
M. Coste. An introduction to o-minimal geometry . Istituti editoriali e poligrafici internazionali Pisa, 2000
work page 2000
-
[21]
Self-organizing maps, theory and applications
Marie Cottrell, Madalina Olteanu, Fabrice Rossi, and Nathalie N Villa-Vialaneix. Self-organizing maps, theory and applications. Revista de Investigacion Operacional, 39(1):1–22, 2018
work page 2018
-
[22]
M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems, 26, 2013
work page 2013
-
[23]
F. De Goes, K. Breeden, V . Ostromoukhov, and M. Desbrun. Blue noise through optimal transport. ACM Transactions on Graphics (TOG), 31(6):1–11, 2012
work page 2012
-
[24]
F. de Gournay, J. Kahn, and L. Lebrat. Differentiation and regularity of semi-discrete optimal transport with respect to the parameters of the discrete measure. Numerische Mathematik, 141(2):429–453, 2019
work page 2019
-
[25]
I. Deshpande, Y .T. Hu, R. Sun, A. Pyrros, N. Siddiqui, S. Koyejo, Z. Zhao, D. Forsyth, and A. Schwing. Max-Sliced Wasserstein distance and its use for GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10648–10656, 2019
work page 2019
-
[26]
L. Dries and C. Miller. On the real exponential field with restricted analytic functions. Israel Journal of Mathematics, 92:427, 1995
work page 1995
-
[27]
L. Dries and C. Miller. Geometric categories and o-minimal structures. Duke Mathematical Journal, 84(2):497 – 540, 1996
work page 1996
-
[28]
Q. Du, M. Emelianenko, and L. Ju. Convergence of the Lloyd algorithm for computing centroidal Vorono¨ı tessellations. SIAM Journal on Numerical Analysis, 44(1):102–119, 2006
work page 2006
-
[29]
Q. Du, V . Faber, and M. Gunzburger. Centroidal Vorono ¨ı tessellations: Applications and algorithms. SIAM Review, 41(4):637–676, 1999
work page 1999
-
[30]
M. Emelianenko, L. Ju, and A. Rand. Nondegeneracy and weak global convergence of the Lloyd algorithm in Rd. SIAM Journal on Numerical Analysis, 46(3):1423–1441, 2008
work page 2008
- [31]
-
[32]
A. Genevay, L. Chizat, F. Bach, M. Cuturi, and G. Peyr´e. Sample complexity of Sinkhorn divergences. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 1574–1583. PMLR, 2019
work page 2019
-
[33]
A. Genevay, M. Cuturi, G. Peyr ´e, and F. Bach. Stochastic optimization for large-scale optimal trans- port. In Advances in Neural Information Processing Systems , volume 29. Curran Associates, Inc., 2016
work page 2016
-
[34]
S. Graf and H. Luschgy. Foundations of Quantization for Probability Distributions , volume 1730. Springer Berlin, Heidelberg, 2000
work page 2000
-
[35]
A.D. Ioffe. Variational analysis of regular mappings. Springer Monographs in Mathematics. Springer, Cham, 2017. 25
work page 2017
-
[36]
Integration of semialgebraic functions and integrated nash functions
Tobias Kaiser. Integration of semialgebraic functions and integrated nash functions. Mathematische Zeitschrift, 275(1):349–366, 2013
work page 2013
-
[37]
J. Kieffer. Exponential rate of convergence for Lloyd’s method I. IEEE Transactions on Information Theory, 28(2):205–210, 1982
work page 1982
- [38]
-
[39]
Q. Kitagawa, J. M ´erigot and B. Thibert. Convergence of a Newton algorithm for semi-discrete optimal transport. Journal of the European Mathematical Society, 21, 2019
work page 2019
-
[40]
Self-organizing maps: Ophmization approaches
Teuvo Kohonen. Self-organizing maps: Ophmization approaches. In Artificial neural networks, pages 981–990. Elsevier, 1991
work page 1991
-
[41]
S. Kolouri, K. Nadjahi, U. Simsekli, R. Badeau, and G. Rohde. Generalized Sliced Wasserstein dis- tances. Advances in neural information processing systems, 32, 2019
work page 2019
-
[42]
K. Kurdyka. On gradients of functions definable in o-minimal structures. Annales de l’Institut Fourier, 48(3):769–783, 1998
work page 1998
-
[43]
K. Kurdyka and A. Parusinski. Wf-stratification of subanalytic functions and the łojasiewicz inequality. Comptes rendus de l’Acad´emie des sciences. S´erie 1, Math´ematique, 1994
work page 1994
-
[44]
R. Larson. Optimum quantization in dynamic systems. IEEE Transactions on Automatic Control , 12(2):162–168, 1967
work page 1967
- [45]
-
[46]
J.-M. Lion and J.-P. Rolin. Int ´egration des fonctions sous-analytiques et volumes des sous-ensembles sous-analytiques. Annales de l’Institut Fourier, 48(3):755–767, 1998
work page 1998
-
[47]
T. Liu and B.F. Lourenc ¸o. Convergence analysis under consistent error bounds.Foundations of Com- putational Mathematics, 24(2):429–479, 2024
work page 2024
-
[48]
S. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129– 137, 1982
work page 1982
-
[49]
T.L. Loi. Lecture 1: O-minimal structures. In The Japanese-Australian Workshop on Real and Com- plex Singularities: JARCS III, volume 43, pages 19–31. Australian National University, Mathematical Sciences Institute, 2010
work page 2010
-
[50]
S. Lojasiewicz. Une propri ´et´e topologique des sous-ensembles analytiques r ´eels. Les ´equations aux d´eriv´ees partielles, 117:87–89, 1963
work page 1963
-
[51]
Q. M ´erigot, F. Santambrogio, and C. Sarrazin. Non-asymptotic convergence bounds for Wasserstein approximation using point clouds. In Advances in Neural Information Processing Systems, volume 34, pages 12810–12821. Curran Associates, Inc., 2021
work page 2021
-
[52]
D. Miller. A preparation theorem for Weierstrass systems. Transactions of the american mathematical society, 358(10):4395–4439, 2006. 26
work page 2006
-
[53]
A. Opris. On preparation theorems for ran, exp-definable functions. Journal of Logic and Analysis , 15, 2021
work page 2021
-
[54]
G. Pag `es. Introduction to vector quantization and its applications for numerics. ESAIM: proceedings and surveys, 48:29–79, 2015
work page 2015
-
[55]
G. Pag `es and J. Yu. Pointwise convergence of the Lloyd algorithm in higher dimension.SIAM Journal on Control and Optimization, 54(5):2354–2382, 2016
work page 2016
-
[56]
A space quantization method for numerical integration
Gilles Pag `es. A space quantization method for numerical integration. Journal of computational and applied mathematics, 89(1):1–38, 1998
work page 1998
-
[57]
A. Parusi ´nski. On the preparation theorem for subanalytic functions. In New developments in singu- larity theory, pages 193–215. Springer, 2001
work page 2001
-
[58]
G. Peyr ´e and M. Cuturi. Computational optimal transport: With applications to data science. Founda- tions and Trends® in Machine Learning, 11(5-6):355–607, 2019
work page 2019
- [59]
-
[60]
F. Santambrogio. Optimal transport for applied mathematicians. Birk¨auser, NY, 55(58-63):94, 2015
work page 2015
-
[61]
The Pfaffian closure of an o-minimal structure
Patrick Speissegger. The Pfaffian closure of an o-minimal structure. Journal f¨ur die reine und ange- wandte Mathematik, 1999(508):189–211, 1999
work page 1999
-
[62]
J. Z. Sun and V . K. Goyal. Optimal quantization of random measurements in compressed sensing. In 2009 IEEE International Symposium on Information Theory, pages 6–10. IEEE, 2009
work page 2009
-
[63]
E. Tanguy. Convergence of SGD for training neural networks with Sliced Wasserstein losses. Trans- actions on Machine Learning Research, 2023
work page 2023
-
[64]
L. Van den Dries. A generalization of the tarski-seidenberg theorem, and some nondefinability results. Bulletin of the American Mathematical Society, 15(2):189–193, 1986
work page 1986
-
[65]
L. van den Dries and P. Speissegger. O-minimal preparation theorems. Model theory and applications, 11:87–116, 2002
work page 2002
-
[66]
C. Villani. Optimal Transport: Old and New , volume 338 of Grundlehren der mathematischen Wis- senschaften. Springer Berlin Heidelberg, 2008
work page 2008
-
[67]
C. Villani. Topics in optimal transportation, volume 58. American Mathematical Soc., 2021
work page 2021
-
[68]
X. Wu. On convergence of Lloyd’s method I. IEEE Transactions on Information Theory, 38(1):171– 174, 1992
work page 1992
-
[69]
Y . Zhou, S.-M. Moosavi-Dezfooli, N.-M. Cheung, and P. Frossard. Adaptive quantization for deep neural network. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 2018. 27
work page 2018
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.