Stability of Partitions Induced by Nearest-Center Assignment Under Perturbations
Pith reviewed 2026-05-10 10:03 UTC · model grok-4.3
The pith
The minimum margin of each point governs whether nearest-center assignments remain unchanged under perturbations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
With fixed centers c1 to ck, for a configuration X the assignment A assigns each xi to its closest center ℓi. The margin γi of xi is the minimum over j ≠ ℓi of the difference ||xi - cj|| minus ||xi - cℓi||. If ε is less than γmin over 2, where γmin is the smallest γi, then for any X' within ε in max-norm, A(X') equals A(X). Conversely any xi that changes assignment satisfies γi ≤ 2ε. Configurations exist where the partition changes for arbitrarily small ε, showing the condition is sufficient but not necessary.
What carries the argument
The pointwise margin γi measuring the separation to the decision boundary between centers, which directly controls the stability radius r(X,A) of the induced partition.
If this is right
- When all margins are larger than twice the perturbation size, the entire induced partition on the n points stays identical.
- Changes in assignment can only happen for points whose margin is at most twice the perturbation size, localizing potential instability.
- If the sum of perturbation sizes over discrete time steps stays below the initial stability radius, the assignment remains constant throughout.
- The expected disagreement between original and perturbed assignments can be bounded using tail probabilities on the margins γi.
Where Pith is reading between the lines
- One could use the margin values to identify which points are most likely to flip under noise and collect more precise measurements for those points.
- Applying the same margin analysis when centers are learned from the data rather than fixed would likely identify additional instability sources from center estimation.
- Maximizing the minimum margin through point placement or feature selection could serve as a design principle for robust clustering applications.
Load-bearing premise
The centers remain fixed and are not updated when the points are perturbed, and distances are Euclidean while perturbations use the infinity norm on the coordinates.
What would settle it
A concrete counter-example would be a configuration of points and centers in which every point has margin strictly larger than 2ε yet a perturbation of size at most ε causes at least one point to be assigned to a different center.
read the original abstract
We study clustering through the partitions it induces on a finite labeled set $[n]=\{1,\dots,n\}$, and analyze how these partitions change under perturbations of a point configuration $X=(x_1,\dots,x_n)\in(\mathbb{R}^d)^n$. We equip the space of partitions $\Pi_n$ with a normalized pairwise disagreement metric $d(\cdot,\cdot)$, and define the stability radius $r(X,A)=\sup\{\varepsilon\ge0: A(X')=A(X)$ whenever $|X-X'\|\le\varepsilon\}$, where $\|X-X'\|=\max_i\|x_i-x'_i\|$. Our main results concern nearest-center assignment with fixed centers $\{c_1,\dots,c_k\}\subset\mathbb{R}^d$. For each point, we define the margin $\gamma_i=\min_{j\ne\ell_i}(\|x_i-c_j\|-\|x_i-c_{\ell_i}\|)$ and $\gamma_{\min}=\min_i\gamma_i$, where $\ell_i$ denotes the assigned center. We show that if $\varepsilon<\gamma_{\min}/2$, then no assignments change under perturbation and hence $A(X')=A(X)$. Conversely, any point that changes its assigned center must satisfy $\gamma_i\le2\varepsilon$, showing that instability is localized near decision boundaries. We construct configurations in which arbitrarily small perturbations $\|X-X'\|\le\varepsilon$ alter the induced partition, demonstrating that the margin condition is sufficient but not necessary for stability. We further extend the framework to a discrete-time setting, showing that if $\sum_t \delta_t<r(X(0),A)$, then $A(X(t))=A(X(0))$, and we give a probabilistic bound on $\mathbb{E}[d(A(X),A(X'))]$ in terms of tail probabilities relative to $\gamma_i$. This framework identifies the margin as the key quantity governing both worst-case and average stability, and provides explicit conditions under which clustering-induced partitions remain invariant in a fixed-center Euclidean model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies the stability of partitions induced by nearest-center assignment on a finite point set X in Euclidean space under perturbations. It equips the space of partitions with a normalized pairwise disagreement metric and defines the stability radius r(X,A) as the largest ε such that any perturbation of size at most ε (in the max-over-points norm) preserves the assignment A. For each point it introduces the margin γ_i as the difference between its distance to the assigned center and the next-closest center, proves that ε < γ_min/2 implies A(X')=A(X) for all such perturbations, shows the converse that any changing point must satisfy γ_i ≤ 2ε, constructs configurations in which arbitrarily small perturbations alter the partition (showing the margin bound is sufficient but not necessary), and extends the analysis to accumulated perturbations over discrete time steps and to a probabilistic bound on expected disagreement in terms of tail probabilities relative to the γ_i.
Significance. If the central claims hold, the work supplies a clean geometric criterion—the per-point margin—for determining when fixed-center clustering assignments remain invariant under noise. The deterministic results follow directly from the triangle inequality applied to distance differences, while the constructions, discrete-time accumulation result, and probabilistic tail bound provide additional insight into both worst-case and average-case behavior. Credit is due for the explicit identification of the margin as the governing quantity and for the parameter-free nature of the derived bounds.
minor comments (3)
- The definition of the perturbation norm ||X-X'|| = max_i ||x_i - x'_i|| should explicitly state the norm on each R^d vector (Euclidean or otherwise) to make the factor of 2 in the margin bound fully transparent; this affects only the constant, not the logic.
- In the probabilistic extension, the distribution assumed for the perturbation X' should be stated explicitly (e.g., independent per-coordinate or per-point noise) so that the tail-probability bound can be verified without ambiguity.
- The normalized pairwise disagreement metric d(·,·) on partitions is introduced in the abstract but would benefit from a one-sentence definition or reference in the main text for readers unfamiliar with partition metrics.
Simulated Author's Rebuttal
We thank the referee for the accurate and positive summary of our manuscript, which correctly identifies the stability radius, the role of the per-point margins γ_i, the deterministic bounds via the triangle inequality, the counterexample constructions, the discrete-time accumulation result, and the probabilistic tail bound. We appreciate the recommendation for minor revision and the recognition that the margin provides a clean, parameter-free criterion for invariance under perturbations.
Circularity Check
No significant circularity detected
full rationale
The derivation proceeds directly from the definitions of margin γ_i = min_{j≠ℓ_i} (‖x_i - c_j‖ - ‖x_i - c_ℓ_i‖) and perturbation bound ‖X - X'‖ ≤ ε in the infinity norm. The key inequality follows from the triangle inequality: each distance changes by at most ε, so any pairwise difference changes by at most 2ε. Thus ε < γ_min/2 forces all nearest-center assignments to remain unchanged, and the converse is immediate as the contrapositive. No parameter is fitted to data, no self-citation is load-bearing, and no ansatz or renaming is invoked; the results are self-contained algebraic consequences of the stated model assumptions on fixed centers and finite point sets.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Perturbations are measured using the max-norm ||X - X'|| = max_i ||x_i - x'_i|| over a finite set of points in Euclidean space R^d
- domain assumption Centers {c_1, ..., c_k} are fixed and the assignment uses Euclidean nearest-center rule
invented entities (2)
-
stability radius r(X,A)
no independent evidence
-
per-point margin γ_i
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Jain, A. K. (2010). Data clustering: 50 years beyond K-means.Pattern Recognition Letters, 31(8), 651–666
work page 2010
-
[2]
Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms.IEEE Transactions on Neural Networks, 16(3), 645–678
work page 2005
-
[3]
Ben-David, S., von Luxburg, U., & Pál, D. (2006). A sober look at clustering stability. InProceedings of the 19th Annual Conference on Learning Theory
work page 2006
-
[4]
von Luxburg, U. (2010). Clustering stability: an overview.Foundations and Trends in Machine Learning, 2(3), 235–274
work page 2010
-
[5]
Stanley, R. P. (2011).Enumerative Combinatorics, Volume 1(2nd ed.). Cambridge University Press
work page 2011
-
[6]
Rand, W.M.(1971).Objectivecriteriafortheevaluationofclusteringmethods.Journal of the American Statistical Association, 66(336), 846–850. Princeton University, USA Email address:nahid.sabit@princeton.edu United World College, Mostar, Bosnia & Herzegovina Email address:faija.anjum@uwcmostar.ba
work page 1971
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.