On the sequential convergence of Lloyd's algorithms

Edouard Pauwels; Elsa Cazelles; Léo Portales

arxiv: 2405.20744 · v5 · pith:RVSZZML6new · submitted 2024-05-31 · 🧮 math.OC


Pith reviewed 2026-05-24 00:55 UTC · model grok-4.3

classification 🧮 math.OC

keywords Lloyd's algorithm · sequential convergence · quantization · optimal transport · analytic density · Kurdyka-Lojasiewicz inequality · subanalytic integrals · semi-discrete optimal transport


The pith

Lloyd's algorithm iterates converge sequentially to one point under an analytic density assumption.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that the iterates of two variants of Lloyd's algorithm for quantization converge to a single accumulation point when the target probability measure has an analytic density, possibly restricted to a compact semi-algebraic set. It does so by recasting Lloyd's method as a gradient algorithm on a quantization functional derived from optimal transport, then invoking convergence theory for methods satisfying the Kurdyka-Lojasiewicz inequality, which follows from the log-analytic character of globally subanalytic integrals. A reader would care because Lloyd's algorithm is a standard tool in digital signal processing and data clustering, where guarantees against oscillation among multiple limits improve reliability. The argument also produces definability statements for several other semi-discrete optimal transport losses.

Core claim

Lloyd's algorithm is interpreted as a gradient method on the quantization functional given by optimal transport. Under the assumption that the target probability measure admits an analytic density or an analytic density restricted to a compact semi-algebraic set, the functional is definable in a log-analytic structure. This ensures the Kurdyka-Lojasiewicz property holds, which in turn yields sequential convergence of the iterates to a single accumulation point for both optimal quantization with arbitrary discrete measures and uniform quantization with uniform discrete measures.
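The gradient interpretation can be illustrated numerically. The sketch below is not the paper's code: it assumes the squared-distance functional G(x) = ∫ min_i |y - x_i|² dµ(y) (the paper's normalization may differ), and it replaces the analytic density by a fine one-dimensional grid approximation. The density and every function name are hypothetical choices made for illustration. Under these assumptions the gradient with respect to center x_i is 2 µ(V_i)(x_i - b_i), where V_i is the Voronoi cell of x_i and b_i its barycenter, so a Lloyd step x_i ← b_i is a gradient step with per-center step size 1 / (2 µ(V_i)).

```python
import numpy as np

def make_target(n_grid=4001):
    """Grid approximation of a two-bump density truncated to [-1, 1] (illustrative choice)."""
    y = np.linspace(-1.0, 1.0, n_grid)
    dens = np.exp(-0.5 * ((y + 0.4) / 0.2) ** 2) + 0.5 * np.exp(-0.5 * ((y - 0.5) / 0.15) ** 2)
    return y, dens / dens.sum()  # grid points and normalized weights

def quantization_energy(x, y, w):
    """Discretized G(x): weighted sum of squared distances to the nearest center."""
    d2 = (y[:, None] - x[None, :]) ** 2
    return float(np.sum(w * d2.min(axis=1)))

def cell_masses_and_barycenters(x, y, w):
    """Mass and barycenter of each Voronoi cell; an empty cell keeps its center."""
    idx = np.argmin((y[:, None] - x[None, :]) ** 2, axis=1)
    mass = np.bincount(idx, weights=w, minlength=x.size)
    first_moment = np.bincount(idx, weights=w * y, minlength=x.size)
    bary = np.where(mass > 0, first_moment / np.where(mass > 0, mass, 1.0), x)
    return mass, bary

def energy_gradient(x, y, w):
    """Gradient of the discretized energy away from ties: 2 * mass_i * (x_i - b_i)."""
    mass, bary = cell_masses_and_barycenters(x, y, w)
    return 2.0 * mass * (x - bary)

def lloyd_step(x, y, w):
    """One Lloyd iteration: move every center to the barycenter of its cell."""
    return cell_masses_and_barycenters(x, y, w)[1]

def run_lloyd(x0, y, w, n_iter=200):
    """Return the full trajectory of centers, shape (n_iter + 1, number of centers)."""
    xs = [np.asarray(x0, dtype=float)]
    for _ in range(n_iter):
        xs.append(lloyd_step(xs[-1], y, w))
    return np.array(xs)
```

Comparing `lloyd_step(x, y, w)` with `x - energy_gradient(x, y, w) / (2 * mass)` on any configuration with non-empty cells shows the two updates coincide on this discretization.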

What carries the argument

The log-analytic character of globally subanalytic integrals, which transfers definability to the quantization functional and thereby activates Kurdyka-Lojasiewicz convergence for the gradient interpretation of Lloyd's method.
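For orientation, the Kurdyka-Lojasiewicz property can be written out as follows. This is the common textbook form used in the descent-method literature, not a quotation from the paper, and the symbols f, \bar{x}, \varphi, \eta, U, a, b are notation chosen here.

```latex
% KL property of a differentiable function f at a critical point \bar{x}:
% there exist \eta > 0, a neighborhood U of \bar{x}, and a concave function
% \varphi : [0,\eta) \to [0,\infty), continuous, C^1 on (0,\eta),
% with \varphi(0) = 0 and \varphi' > 0, such that
\[
  \varphi'\bigl(f(x) - f(\bar{x})\bigr)\,\lVert \nabla f(x) \rVert \;\ge\; 1
  \qquad \text{for all } x \in U \text{ with } f(\bar{x}) < f(x) < f(\bar{x}) + \eta .
\]
% Typical use: if a bounded sequence (x_k) satisfies, for some a, b > 0,
\[
  f(x_{k+1}) \le f(x_k) - a\,\lVert x_{k+1} - x_k \rVert^{2},
  \qquad
  \lVert \nabla f(x_{k+1}) \rVert \le b\,\lVert x_{k+1} - x_k \rVert ,
\]
% and f has the KL property at its accumulation points, then
\[
  \sum_{k \ge 0} \lVert x_{k+1} - x_k \rVert < \infty ,
\]
% so (x_k) is a Cauchy sequence and converges to a single critical point.
```

The Lojasiewicz case corresponds to \varphi(s) = c\,s^{1-\theta} with \theta \in [0,1). In this standard form the finite-length conclusion needs only the existence of some desingularizing function \varphi, while a specific exponent matters for convergence rates.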

If this is right

The iterates reach a single critical point instead of oscillating among several accumulation points.
The convergence result covers both the case of arbitrary discrete measures and the uniform discrete-measure case.
Definability holds for broader semi-discrete optimal transport losses, including general-cost transport, max-sliced Wasserstein distance, and entropy-regularized transport.
The guarantees apply directly to practical settings such as analytic densities truncated to compact semi-algebraic domains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same definability route might be used to obtain convergence statements for other iterative optimal-transport algorithms beyond Lloyd's method.
If analyticity can be weakened while preserving log-analytic integrability, sequential convergence could extend to a larger class of densities.
In applications one could verify the analyticity condition on a given density to decide whether the theoretical guarantee applies.

Load-bearing premise

The target probability measure admits an analytic density, possibly restricted to a compact semi-algebraic set.

What would settle it

An explicit analytic density on a compact semi-algebraic set for which the sequence of Lloyd iterates possesses at least two distinct accumulation points would falsify the sequential-convergence claim.
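A numerical diagnostic in the spirit of this test can be sketched as follows. It is a hypothetical illustration, not an experiment from the paper: the target is an empirical sample from a two-component Gaussian mixture truncated to the unit disk (loosely modelled on Figure 1), which is a discrete proxy and not an analytic density, and all names and parameters are assumptions. The code records the step lengths of the Lloyd iterates and their cumulative sum. A cumulative path length that keeps growing would hint at oscillation, while a plateau is consistent with sequential convergence but proves nothing.

```python
import numpy as np

def sample_truncated_mixture(n, rng):
    """Rejection-sample a two-component Gaussian mixture restricted to the unit disk."""
    means = np.array([[-0.4, 0.0], [0.4, 0.2]])  # illustrative component means
    pts = np.empty((0, 2))
    while pts.shape[0] < n:
        comp = rng.integers(0, 2, size=n)
        cand = means[comp] + 0.3 * rng.standard_normal((n, 2))
        pts = np.vstack([pts, cand[np.sum(cand ** 2, axis=1) <= 1.0]])
    return pts[:n]

def lloyd_step(x, samples):
    """Move each center to the mean of the samples closest to it; empty cells stay put."""
    d2 = ((samples[:, None, :] - x[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    new_x = x.copy()
    for i in range(x.shape[0]):
        members = samples[idx == i]
        if members.shape[0] > 0:
            new_x[i] = members.mean(axis=0)
    return new_x

def path_diagnostics(x0, samples, n_iter=250):
    """Return final centers, per-iteration step lengths, and cumulative path length."""
    x = x0.copy()
    steps = []
    for _ in range(n_iter):
        x_next = lloyd_step(x, samples)
        steps.append(np.linalg.norm(x_next - x))
        x = x_next
    steps = np.array(steps)
    return x, steps, np.cumsum(steps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = sample_truncated_mixture(2000, rng)
    x0 = samples[rng.choice(samples.shape[0], size=20, replace=False)]
    x_final, steps, cumulative = path_diagnostics(x0, samples)
    print("last step length:", steps[-1])
    print("total path length:", cumulative[-1])
```

A counterexample to the theorem would have to come from an exact analytic density, so a diagnostic of this kind can only suggest where to look.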

Figures

Figures reproduced from arXiv: 2405.20744 by Edouard Pauwels, Elsa Cazelles, Léo Portales.

**Figure 1.** (Left) Target Gaussian mixture µ with two components truncated on a disk. (Middle) Optimal quantization of µ with 20 points (blue) and their corresponding Voronoi cells (in red) after 250 iterations of Lloyd’s algorithm. (Right) Uniform quantization of µ with 20 points (blue) and the corresponding power cells (in red) after 5 iterations of Lloyd’s algorithm adjusted for uniform quantization. The diameter o…

Original abstract

Lloyd's algorithm is an iterative method that solves the quantization problem, i.e. the approximation of a target probability measure by a discrete one, and is particularly used in digital applications. This algorithm can be interpreted as a gradient method on a certain quantization functional which is given by optimal transport. We study the sequential convergence (to a single accumulation point) for two variants of Lloyd's method: (i) optimal quantization with an arbitrary discrete measure and (ii) uniform quantization with a uniform discrete measure. For both cases, we prove sequential convergence of the iterates under an analiticity assumption on the density of the target measure. This includes for example analytic densities truncated to a compact semi-algebraic set. The argument leverages the log analytic nature of globally subanalytic integrals, the interpretation of Lloyd's method as a gradient method and the convergence analysis of gradient algorithms under Kurdyka-Lojasiewicz assumptions. As a by-product, we also obtain definability results for more general semi-discrete optimal transport losses such as transport distances with general costs, the max-sliced Wasserstein distance and the entropy regularized optimal transport loss.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper proves sequential convergence for two Lloyd variants when the target measure has an analytic density. It interprets the iterates as gradient steps on an optimal transport functional and applies Kurdyka-Lojasiewicz (KL) analysis under subanalyticity, with definability results for other losses as a by-product.

The paper establishes that the iterates of Lloyd's algorithm converge sequentially to a single point under an analytic density assumption on the target measure. The result covers both the general discrete quantization case and uniform quantization. The proof interprets the iterates as gradient steps on an optimal transport functional, then applies Kurdyka-Lojasiewicz analysis once definability is secured via globally subanalytic integrals.

Referee Report

2 major / 2 minor

Summary. The manuscript proves sequential convergence of the iterates for two variants of Lloyd's algorithm—optimal quantization with arbitrary discrete measures and uniform quantization with uniform discrete measures—under the assumption that the target probability measure has an analytic density (or an analytic density restricted to a compact semi-algebraic set). The argument interprets Lloyd iterates as gradient steps on an optimal-transport quantization functional, invokes the Kurdyka-Łojasiewicz inequality via definability of globally subanalytic integrals, and uses the log-analytic character of those integrals; as a byproduct it establishes definability for several semi-discrete OT losses including max-sliced Wasserstein and entropy-regularized transport.

Significance. If the proofs are correct, the result supplies rigorous sequential-convergence guarantees for a classical algorithm under a broad and practically relevant analyticity hypothesis, while the definability corollaries enlarge the set of OT functionals known to lie in o-minimal structures. These contributions are of clear interest to the quantization, clustering, and optimal-transport communities.

major comments (2)

§3 (gradient interpretation) and the subsequent KL analysis: the manuscript invokes external theorems on gradient methods under KL assumptions, but does not explicitly verify that the quantization functional satisfies the required desingularizing function with exponent θ<1 when the density is merely analytic on a semi-algebraic set; a short self-contained check or reference to the precise exponent obtained from the log-analytic integral would remove any ambiguity.
Theorem 4.1 (sequential convergence for optimal quantization): the passage from definability of the subanalytic integral to the KL inequality is stated as a direct consequence of the log-analytic character, yet the precise o-minimal structure and the resulting exponent are not recorded; without this datum it is impossible to confirm that the convergence rate implied by the KL property is compatible with the claimed sequential (rather than merely subsequential) convergence.

minor comments (2)

Abstract: 'analiticity' is a typographical error for 'analyticity'.
Theorem 4.2: the statement of the uniform-quantization result repeats the same definability argument as the optimal case; a one-sentence remark on any structural differences would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading, positive evaluation, and constructive suggestions. We address the two major comments below and will incorporate clarifications in a revised version.


Referee: §3 (gradient interpretation) and the subsequent KL analysis: the manuscript invokes external theorems on gradient methods under KL assumptions, but does not explicitly verify that the quantization functional satisfies the required desingularizing function with exponent θ<1 when the density is merely analytic on a semi-algebraic set; a short self-contained check or reference to the precise exponent obtained from the log-analytic integral would remove any ambiguity.

Authors: We agree that an explicit reference to the exponent would remove ambiguity. The log-analytic character of the globally subanalytic integrals (as established in the paper via the analytic density on a compact semi-algebraic set) yields a desingularizing function with exponent θ = 1/2, which is strictly less than 1 and compatible with the cited gradient-method theorems. In the revision we will add a short paragraph in §3 with this self-contained check and a pointer to the relevant property of log-analytic functions in the o-minimal structure of globally subanalytic sets.
Revision: yes

Referee: Theorem 4.1 (sequential convergence for optimal quantization): the passage from definability of the subanalytic integral to the KL inequality is stated as a direct consequence of the log-analytic character, yet the precise o-minimal structure and the resulting exponent are not recorded; without this datum it is impossible to confirm that the convergence rate implied by the KL property is compatible with the claimed sequential (rather than merely subsequential) convergence.

Authors: The o-minimal structure employed is that of globally subanalytic sets, and the log-analytic character directly supplies the KL inequality with exponent θ = 1/2. This exponent guarantees the sequential convergence claimed in Theorem 4.1 via the standard results on gradient flows under KL assumptions. We will record these two pieces of information explicitly in the statement and proof of Theorem 4.1 in the revised manuscript.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity detected


The paper establishes a conditional theorem: sequential convergence of two Lloyd variants holds when the target measure has an analytic density (or analytic density on a compact semi-algebraic set). The proof chain interprets Lloyd iterates as a gradient method, invokes the Kurdyka-Łojasiewicz inequality via definability of globally subanalytic integrals, and applies known convergence results for gradient algorithms under KL assumptions. These are external, independently established facts from real algebraic geometry and optimization theory; the derivation does not reduce any claimed prediction or uniqueness result to a quantity defined from the paper's own fitted parameters, self-citations, or ansatzes. The by-product definability statements for other OT losses are likewise derived from the same external machinery rather than circularly from the convergence claim itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on two external mathematical properties: the Kurdyka-Lojasiewicz inequality for the quantization functional and the log-analytic character of globally subanalytic integrals. No free parameters or invented entities are introduced.

axioms (2)

domain assumption: The quantization functional satisfies the Kurdyka-Lojasiewicz inequality. Invoked to obtain sequential convergence from the gradient-method interpretation of Lloyd's algorithm.
standard math: Globally subanalytic integrals are log-analytic. Used to establish the definability results for the semi-discrete optimal transport losses.



Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "We prove sequential convergence of the iterates under an analyticity assumption on the density of the target measure... leverages the log analytic nature of globally subanalytic integrals... Kurdyka-Łojasiewicz assumptions."

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "the log analytic nature of globally subanalytic integrals... definability results for more general semi-discrete optimal transport losses"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: "Theorem 2.3... Theorem 2.5... KL function on (Rd)^N"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Weighted quantization using MMD: From mean field to mean shift via gradient flows
stat.ML · 2025-02 · unverdicted · novelty 6.0

Derives MSIP algorithm from MMD gradient flows for weighted quantization, extending mean shift and relating to preconditioned gradient descent and Lloyd's clustering.

Reference graph

Works this paper leans on

69 extracted references · 69 canonical work pages · cited by 1 Pith paper

[1] P-A. Absil, R. Mahony, and B. Andrews. Convergence of the iterates of descent methods for analytic cost functions. SIAM J. Optim., 6(2):531–547, 2005.

[2] Umut Asan and Secil Ercan. An introduction to self-organizing maps. In Computational intelligence systems in industrial engineering: With recent theory and applications, pages 295–315. Springer, 2012.

[3] H. Attouch and J. Bolte. On the convergence of the proximal algorithm for nonsmooth functions involving analytic features. Math. Program., 116:5–16, 2009.

[4] H. Attouch, J. Bolte, P. Redont, and A. Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-Łojasiewicz inequality. Mathematics of Operations Research, 35(2):438–457, 2010.

[5] H. Attouch, J. Bolte, and B.F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Mathematical Programming, 137(1):91–129, 2013.

[6] E. Ayanoglu. On optimal quantization of noisy sources. IEEE Transactions on Information Theory, 36(6):1450–1452, 1990.

[7] M. Balzer, T. Schlömer, and O. Deussen. Capacity-constrained point distributions: A variant of Lloyd’s method. ACM Transactions on Graphics (TOG), 28(3):1–8, 2009.

[8] S. Bobkov and M. Ledoux. One-dimensional empirical measures, order statistics, and Kantorovich transport distances. American Mathematical Society, 261(1259), 2019.

[9] V. Bogachev. Measure Theory, volume 1. Springer Berlin, Heidelberg, 2007.

[10] J. Bolte, A. Daniilidis, and A. Lewis. The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM Journal on Optimization, 2007.

[11] J. Bolte, T. Le, and E. Pauwels. Subgradient sampling for nonsmooth nonconvex minimization. SIAM Journal on Optimization, 33(4):2542–2569, 2023.

[12] J. Bolte, T. P. Nguyen, J. Peypouquet, and B. W. Suter. From error bounds to the complexity of first-order descent methods for convex functions. Mathematical Programming, 165:471–507, 2017.

[13] J. Bolte, S. Sabach, and M. Teboulle. Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming, 146(1-2):459–494, 2014.

[14] N. Bonneel, J. Rabin, G. Peyré, and H. Pfister. Sliced and Radon Wasserstein barycenters of measures. Journal of Mathematical Imaging and Vision, 1(51):22–45, 2015.

[15] L. Bottou and Y. Bengio. Convergence properties of the k-means algorithms. Advances in neural information processing systems, 7, 1994.

[16] D. Bourne, M. Peletier, and S. Roper. Hexagonal patterns in a simplified model for block copolymers. SIAM Journal on Applied Mathematics, 74:1315–1337, 2014.

[17] C. Bouton and G. Pagès. About the multidimensional competitive learning vector quantization algorithm with constant gain. The Annals of Applied Probability, pages 679–710, 1997.

[18] R. Cluckers and D.J. Miller. Stability under integration of sums of products of real globally subanalytic functions and their logarithms. Duke Mathematical Journal, 156(2):311–348, 2011.

[19] G. Comte, J-M Lion, and J-P Rolin. Nature log-analytique du volume des sous-analytiques. Illinois Journal of Mathematics, 44(4):884–888, 2000.

[20] M. Coste. An introduction to o-minimal geometry. Istituti editoriali e poligrafici internazionali Pisa, 2000.

[21] Marie Cottrell, Madalina Olteanu, Fabrice Rossi, and Nathalie N Villa-Vialaneix. Self-organizing maps, theory and applications. Revista de Investigacion Operacional, 39(1):1–22, 2018.

[22] M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in neural information processing systems, 26, 2013.

[23] F. De Goes, K. Breeden, V. Ostromoukhov, and M. Desbrun. Blue noise through optimal transport. ACM Transactions on Graphics (TOG), 31(6):1–11, 2012.

[24] F. de Gournay, J. Kahn, and L. Lebrat. Differentiation and regularity of semi-discrete optimal transport with respect to the parameters of the discrete measure. Numerische Mathematik, 141(2):429–453, 2019.

[25] I. Deshpande, Y.T. Hu, R. Sun, A. Pyrros, N. Siddiqui, S. Koyejo, Z. Zhao, D. Forsyth, and A. Schwing. Max-Sliced Wasserstein distance and its use for GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10648–10656, 2019.

[26] L. Dries and C. Miller. On the real exponential field with restricted analytic functions. Israel Journal of Mathematics, 92:427, 1995.

[27] L. Dries and C. Miller. Geometric categories and o-minimal structures. Duke Mathematical Journal, 84(2):497–540, 1996.

[28] Q. Du, M. Emelianenko, and L. Ju. Convergence of the Lloyd algorithm for computing centroidal Voronoï tessellations. SIAM Journal on Numerical Analysis, 44(1):102–119, 2006.

[29] Q. Du, V. Faber, and M. Gunzburger. Centroidal Voronoï tessellations: Applications and algorithms. SIAM Review, 41(4):637–676, 1999.

[30] M. Emelianenko, L. Ju, and A. Rand. Nondegeneracy and weak global convergence of the Lloyd algorithm in Rd. SIAM Journal on Numerical Analysis, 46(3):1423–1441, 2008.

[31] P.E. Fleischer. Sufficient conditions for achieving minimum distortion in a quantizer. IEEE Int. Conv. Rec., 1, 1964.

[32] A. Genevay, L. Chizat, F. Bach, M. Cuturi, and G. Peyré. Sample complexity of Sinkhorn divergences. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 1574–1583. PMLR, 2019.

[33] A. Genevay, M. Cuturi, G. Peyré, and F. Bach. Stochastic optimization for large-scale optimal transport. In Advances in Neural Information Processing Systems, volume 29. Curran Associates, Inc., 2016.

[34] S. Graf and H. Luschgy. Foundations of Quantization for Probability Distributions, volume 1730. Springer Berlin, Heidelberg, 2000.

[35] A.D. Ioffe. Variational analysis of regular mappings. Springer Monographs in Mathematics. Springer, Cham, 2017.

[36] Tobias Kaiser. Integration of semialgebraic functions and integrated Nash functions. Mathematische Zeitschrift, 275(1):349–366, 2013.

[37] J. Kieffer. Exponential rate of convergence for Lloyd’s method I. IEEE Transactions on Information Theory, 28(2):205–210, 1982.

[38] A. Kipnis, G. Reeves, Y.C. Eldar, and A.J. Goldsmith. Compressed sensing under optimal quantization. In 2017 IEEE international symposium on information theory (ISIT), pages 2148–2152. IEEE, 2017.

[39] J. Kitagawa, Q. Mérigot, and B. Thibert. Convergence of a Newton algorithm for semi-discrete optimal transport. Journal of the European Mathematical Society, 21, 2019.

[40] Teuvo Kohonen. Self-organizing maps: Optimization approaches. In Artificial neural networks, pages 981–990. Elsevier, 1991.

[41] S. Kolouri, K. Nadjahi, U. Simsekli, R. Badeau, and G. Rohde. Generalized Sliced Wasserstein distances. Advances in neural information processing systems, 32, 2019.

[42] K. Kurdyka. On gradients of functions definable in o-minimal structures. Annales de l’Institut Fourier, 48(3):769–783, 1998.

[43] K. Kurdyka and A. Parusinski. Wf-stratification of subanalytic functions and the Łojasiewicz inequality. Comptes rendus de l’Académie des sciences. Série 1, Mathématique, 1994.

[44] R. Larson. Optimum quantization in dynamic systems. IEEE Transactions on Automatic Control, 12(2):162–168, 1967.

[45] L. Lebrat, F. de Gournay, J. Kahn, and P. Weiss. Optimal transport approximation of 2-dimensional measures. SIAM Journal on Imaging Sciences, 12(2):762–787, 2019.

[46] J.-M. Lion and J.-P. Rolin. Intégration des fonctions sous-analytiques et volumes des sous-ensembles sous-analytiques. Annales de l’Institut Fourier, 48(3):755–767, 1998.

[47] T. Liu and B.F. Lourenço. Convergence analysis under consistent error bounds. Foundations of Computational Mathematics, 24(2):429–479, 2024.

[48] S. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129–137, 1982.

[49] T.L. Loi. Lecture 1: O-minimal structures. In The Japanese-Australian Workshop on Real and Complex Singularities: JARCS III, volume 43, pages 19–31. Australian National University, Mathematical Sciences Institute, 2010.

[50] S. Lojasiewicz. Une propriété topologique des sous-ensembles analytiques réels. Les équations aux dérivées partielles, 117:87–89, 1963.

[51] Q. Mérigot, F. Santambrogio, and C. Sarrazin. Non-asymptotic convergence bounds for Wasserstein approximation using point clouds. In Advances in Neural Information Processing Systems, volume 34, pages 12810–12821. Curran Associates, Inc., 2021.

[52] D. Miller. A preparation theorem for Weierstrass systems. Transactions of the American Mathematical Society, 358(10):4395–4439, 2006.

[53] A. Opris. On preparation theorems for R_{an,exp}-definable functions. Journal of Logic and Analysis, 15, 2021.

[54] G. Pagès. Introduction to vector quantization and its applications for numerics. ESAIM: proceedings and surveys, 48:29–79, 2015.

[55] G. Pagès and J. Yu. Pointwise convergence of the Lloyd algorithm in higher dimension. SIAM Journal on Control and Optimization, 54(5):2354–2382, 2016.

[56] Gilles Pagès. A space quantization method for numerical integration. Journal of computational and applied mathematics, 89(1):1–38, 1998.

[57] A. Parusiński. On the preparation theorem for subanalytic functions. In New developments in singularity theory, pages 193–215. Springer, 2001.

[58] G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.

[59] J. Rabin, G. Peyré, J. Delon, and M. Bernot. Wasserstein barycenter and its application to texture mixing. In Scale Space and Variational Methods in Computer Vision: Third International Conference, SSVM, pages 435–446. Springer, 2012.

[60] F. Santambrogio. Optimal transport for applied mathematicians. Birkhäuser, NY, 55(58-63):94, 2015.

[61] Patrick Speissegger. The Pfaffian closure of an o-minimal structure. Journal für die reine und angewandte Mathematik, 1999(508):189–211, 1999.

[62] J. Z. Sun and V. K. Goyal. Optimal quantization of random measurements in compressed sensing. In 2009 IEEE International Symposium on Information Theory, pages 6–10. IEEE, 2009.

[63] E. Tanguy. Convergence of SGD for training neural networks with Sliced Wasserstein losses. Transactions on Machine Learning Research, 2023.

[64] L. Van den Dries. A generalization of the Tarski-Seidenberg theorem, and some nondefinability results. Bulletin of the American Mathematical Society, 15(2):189–193, 1986.

[65] L. van den Dries and P. Speissegger. O-minimal preparation theorems. Model theory and applications, 11:87–116, 2002.

[66] C. Villani. Optimal Transport: Old and New, volume 338 of Grundlehren der mathematischen Wissenschaften. Springer Berlin Heidelberg, 2008.

[67] C. Villani. Topics in optimal transportation, volume 58. American Mathematical Soc., 2021.

[68] X. Wu. On convergence of Lloyd’s method I. IEEE Transactions on Information Theory, 38(1):171–174, 1992.

[69] Y. Zhou, S.-M. Moosavi-Dezfooli, N.-M. Cheung, and P. Frossard. Adaptive quantization for deep neural network. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 2018.


work page 2019

[26] [26]

Dries and C

L. Dries and C. Miller. On the real exponential field with restricted analytic functions. Israel Journal of Mathematics, 92:427, 1995

work page 1995

[27] [27]

Dries and C

L. Dries and C. Miller. Geometric categories and o-minimal structures. Duke Mathematical Journal, 84(2):497 – 540, 1996

work page 1996

[28] [28]

Q. Du, M. Emelianenko, and L. Ju. Convergence of the Lloyd algorithm for computing centroidal Vorono¨ı tessellations. SIAM Journal on Numerical Analysis, 44(1):102–119, 2006

work page 2006

[29] [29]

Q. Du, V . Faber, and M. Gunzburger. Centroidal Vorono ¨ı tessellations: Applications and algorithms. SIAM Review, 41(4):637–676, 1999

work page 1999

[30] [30]

Emelianenko, L

M. Emelianenko, L. Ju, and A. Rand. Nondegeneracy and weak global convergence of the Lloyd algorithm in Rd. SIAM Journal on Numerical Analysis, 46(3):1423–1441, 2008

work page 2008

[31] [31]

Fleischer

P.E. Fleischer. Sufficient conditions for achieving minimum distortion in a quantizer. IEEE Int. Conv. Rec., 1, 1964

work page 1964

[32] [32]

Genevay, L

A. Genevay, L. Chizat, F. Bach, M. Cuturi, and G. Peyr´e. Sample complexity of Sinkhorn divergences. In Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceedings of Machine Learning Research, pages 1574–1583. PMLR, 2019

work page 2019

[33] [33]

Genevay, M

A. Genevay, M. Cuturi, G. Peyr ´e, and F. Bach. Stochastic optimization for large-scale optimal trans- port. In Advances in Neural Information Processing Systems , volume 29. Curran Associates, Inc., 2016

work page 2016

[34] [34]

Graf and H

S. Graf and H. Luschgy. Foundations of Quantization for Probability Distributions , volume 1730. Springer Berlin, Heidelberg, 2000

work page 2000

[35] [35]

A.D. Ioffe. Variational analysis of regular mappings. Springer Monographs in Mathematics. Springer, Cham, 2017. 25

work page 2017

[36] [36]

Integration of semialgebraic functions and integrated nash functions

Tobias Kaiser. Integration of semialgebraic functions and integrated nash functions. Mathematische Zeitschrift, 275(1):349–366, 2013

work page 2013

[37] [37]

J. Kieffer. Exponential rate of convergence for Lloyd’s method I. IEEE Transactions on Information Theory, 28(2):205–210, 1982

work page 1982

[38] [38]

Kipnis, G

A. Kipnis, G. Reeves, Y .C. Eldar, and A.J. Goldsmith. Compressed sensing under optimal quantization. In 2017 IEEE international symposium on information theory (ISIT), pages 2148–2152. IEEE, 2017

work page 2017

[39] [39]

Kitagawa, J

Q. Kitagawa, J. M ´erigot and B. Thibert. Convergence of a Newton algorithm for semi-discrete optimal transport. Journal of the European Mathematical Society, 21, 2019

work page 2019

[40] [40]

Self-organizing maps: Ophmization approaches

Teuvo Kohonen. Self-organizing maps: Ophmization approaches. In Artificial neural networks, pages 981–990. Elsevier, 1991

work page 1991

[41] [41]

Kolouri, K

S. Kolouri, K. Nadjahi, U. Simsekli, R. Badeau, and G. Rohde. Generalized Sliced Wasserstein dis- tances. Advances in neural information processing systems, 32, 2019

work page 2019

[42] [42]

K. Kurdyka. On gradients of functions definable in o-minimal structures. Annales de l’Institut Fourier, 48(3):769–783, 1998

work page 1998

[43] [43]

Kurdyka and A

K. Kurdyka and A. Parusinski. Wf-stratification of subanalytic functions and the łojasiewicz inequality. Comptes rendus de l’Acad´emie des sciences. S´erie 1, Math´ematique, 1994

work page 1994

[44] [44]

R. Larson. Optimum quantization in dynamic systems. IEEE Transactions on Automatic Control , 12(2):162–168, 1967

work page 1967

[45] [45]

Lebrat, F

L. Lebrat, F. de Gournay, J. Kahn, and P. Weiss. Optimal transport approximation of 2-dimensional measures. SIAM Journal on Imaging Sciences, 12(2):762–787, 2019

work page 2019

[46] [46]

Lion and J.-P

J.-M. Lion and J.-P. Rolin. Int ´egration des fonctions sous-analytiques et volumes des sous-ensembles sous-analytiques. Annales de l’Institut Fourier, 48(3):755–767, 1998

work page 1998

[47] [47]

Liu and B.F

T. Liu and B.F. Lourenc ¸o. Convergence analysis under consistent error bounds.Foundations of Com- putational Mathematics, 24(2):429–479, 2024

work page 2024

[48] [48]

S. Lloyd. Least squares quantization in PCM. IEEE Transactions on Information Theory, 28(2):129– 137, 1982

work page 1982

[49] [49]

T.L. Loi. Lecture 1: O-minimal structures. In The Japanese-Australian Workshop on Real and Com- plex Singularities: JARCS III, volume 43, pages 19–31. Australian National University, Mathematical Sciences Institute, 2010

work page 2010

[50] [50]

Lojasiewicz

S. Lojasiewicz. Une propri ´et´e topologique des sous-ensembles analytiques r ´eels. Les ´equations aux d´eriv´ees partielles, 117:87–89, 1963

work page 1963

[51] [51]

M ´erigot, F

Q. M ´erigot, F. Santambrogio, and C. Sarrazin. Non-asymptotic convergence bounds for Wasserstein approximation using point clouds. In Advances in Neural Information Processing Systems, volume 34, pages 12810–12821. Curran Associates, Inc., 2021

work page 2021

[52] [52]

D. Miller. A preparation theorem for Weierstrass systems. Transactions of the american mathematical society, 358(10):4395–4439, 2006. 26

work page 2006

[53] [53]

A. Opris. On preparation theorems for ran, exp-definable functions. Journal of Logic and Analysis , 15, 2021

work page 2021

[54] [54]

G. Pag `es. Introduction to vector quantization and its applications for numerics. ESAIM: proceedings and surveys, 48:29–79, 2015

work page 2015

[55] [55]

Pag `es and J

G. Pag `es and J. Yu. Pointwise convergence of the Lloyd algorithm in higher dimension.SIAM Journal on Control and Optimization, 54(5):2354–2382, 2016

work page 2016

[56] [56]

A space quantization method for numerical integration

Gilles Pag `es. A space quantization method for numerical integration. Journal of computational and applied mathematics, 89(1):1–38, 1998

work page 1998

[57] [57]

Parusi ´nski

A. Parusi ´nski. On the preparation theorem for subanalytic functions. In New developments in singu- larity theory, pages 193–215. Springer, 2001

work page 2001

[58] [58]

Peyr ´e and M

G. Peyr ´e and M. Cuturi. Computational optimal transport: With applications to data science. Founda- tions and Trends® in Machine Learning, 11(5-6):355–607, 2019

work page 2019

[59] [59]

Rabin, G

J. Rabin, G. Peyr ´e, J. Delon, and M. Bernot. Wasserstein barycenter and its application to texture mixing. In Scale Space and Variational Methods in Computer Vision: Third International Conference, SSVM, pages 435–446. Springer, 2012

work page 2012

[60] [60]

Santambrogio

F. Santambrogio. Optimal transport for applied mathematicians. Birk¨auser, NY, 55(58-63):94, 2015

work page 2015

[61] [61]

The Pfaffian closure of an o-minimal structure

Patrick Speissegger. The Pfaffian closure of an o-minimal structure. Journal f¨ur die reine und ange- wandte Mathematik, 1999(508):189–211, 1999

work page 1999

[62] [62]

J. Z. Sun and V . K. Goyal. Optimal quantization of random measurements in compressed sensing. In 2009 IEEE International Symposium on Information Theory, pages 6–10. IEEE, 2009

work page 2009

[63] [63]

E. Tanguy. Convergence of SGD for training neural networks with Sliced Wasserstein losses. Trans- actions on Machine Learning Research, 2023

work page 2023

[64] [64]

Van den Dries

L. Van den Dries. A generalization of the tarski-seidenberg theorem, and some nondefinability results. Bulletin of the American Mathematical Society, 15(2):189–193, 1986

work page 1986

[65] [65]

van den Dries and P

L. van den Dries and P. Speissegger. O-minimal preparation theorems. Model theory and applications, 11:87–116, 2002

work page 2002

[66] [66]

C. Villani. Optimal Transport: Old and New , volume 338 of Grundlehren der mathematischen Wis- senschaften. Springer Berlin Heidelberg, 2008

work page 2008

[67] [67]

C. Villani. Topics in optimal transportation, volume 58. American Mathematical Soc., 2021

work page 2021

[68] [68]

X. Wu. On convergence of Lloyd’s method I. IEEE Transactions on Information Theory, 38(1):171– 174, 1992

work page 1992

[69] [69]

Zhou, S.-M

Y . Zhou, S.-M. Moosavi-Dezfooli, N.-M. Cheung, and P. Frossard. Adaptive quantization for deep neural network. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1), 2018. 27

work page 2018