pith. sign in

arxiv: 2604.15578 · v1 · submitted 2026-04-16 · 🧮 math.CO · math.OC· math.PR

Stability of Partitions Induced by Nearest-Center Assignment Under Perturbations

Pith reviewed 2026-05-10 10:03 UTC · model grok-4.3

classification 🧮 math.CO math.OCmath.PR
keywords nearest center clusteringpartition stabilityperturbation analysismargin conditionassignment invariancestability radiusEuclidean distance
0
0 comments X

The pith

The minimum margin of each point governs whether nearest-center assignments remain unchanged under perturbations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper analyzes the stability of partitions produced by assigning each point in a finite set to its nearest center among a fixed collection of centers. It defines the margin of a point as the difference in Euclidean distance to its assigned center versus the next closest center, and proves that if the smallest such margin across all points exceeds twice the perturbation size measured in the infinity norm, then no point changes its assignment. This provides an explicit condition under which small changes to the data leave the clustering partition completely invariant. The analysis further shows that any point which does switch must have had a margin no larger than twice the perturbation, and extends the result to chains of perturbations and probabilistic expectations of disagreement.

Core claim

With fixed centers c1 to ck, for a configuration X the assignment A assigns each xi to its closest center ℓi. The margin γi of xi is the minimum over j ≠ ℓi of the difference ||xi - cj|| minus ||xi - cℓi||. If ε is less than γmin over 2, where γmin is the smallest γi, then for any X' within ε in max-norm, A(X') equals A(X). Conversely any xi that changes assignment satisfies γi ≤ 2ε. Configurations exist where the partition changes for arbitrarily small ε, showing the condition is sufficient but not necessary.

What carries the argument

The pointwise margin γi measuring the separation to the decision boundary between centers, which directly controls the stability radius r(X,A) of the induced partition.

If this is right

  • When all margins are larger than twice the perturbation size, the entire induced partition on the n points stays identical.
  • Changes in assignment can only happen for points whose margin is at most twice the perturbation size, localizing potential instability.
  • If the sum of perturbation sizes over discrete time steps stays below the initial stability radius, the assignment remains constant throughout.
  • The expected disagreement between original and perturbed assignments can be bounded using tail probabilities on the margins γi.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • One could use the margin values to identify which points are most likely to flip under noise and collect more precise measurements for those points.
  • Applying the same margin analysis when centers are learned from the data rather than fixed would likely identify additional instability sources from center estimation.
  • Maximizing the minimum margin through point placement or feature selection could serve as a design principle for robust clustering applications.

Load-bearing premise

The centers remain fixed and are not updated when the points are perturbed, and distances are Euclidean while perturbations use the infinity norm on the coordinates.

What would settle it

A concrete counter-example would be a configuration of points and centers in which every point has margin strictly larger than 2ε yet a perturbation of size at most ε causes at least one point to be assigned to a different center.

read the original abstract

We study clustering through the partitions it induces on a finite labeled set $[n]=\{1,\dots,n\}$, and analyze how these partitions change under perturbations of a point configuration $X=(x_1,\dots,x_n)\in(\mathbb{R}^d)^n$. We equip the space of partitions $\Pi_n$ with a normalized pairwise disagreement metric $d(\cdot,\cdot)$, and define the stability radius $r(X,A)=\sup\{\varepsilon\ge0: A(X')=A(X)$ whenever $|X-X'\|\le\varepsilon\}$, where $\|X-X'\|=\max_i\|x_i-x'_i\|$. Our main results concern nearest-center assignment with fixed centers $\{c_1,\dots,c_k\}\subset\mathbb{R}^d$. For each point, we define the margin $\gamma_i=\min_{j\ne\ell_i}(\|x_i-c_j\|-\|x_i-c_{\ell_i}\|)$ and $\gamma_{\min}=\min_i\gamma_i$, where $\ell_i$ denotes the assigned center. We show that if $\varepsilon<\gamma_{\min}/2$, then no assignments change under perturbation and hence $A(X')=A(X)$. Conversely, any point that changes its assigned center must satisfy $\gamma_i\le2\varepsilon$, showing that instability is localized near decision boundaries. We construct configurations in which arbitrarily small perturbations $\|X-X'\|\le\varepsilon$ alter the induced partition, demonstrating that the margin condition is sufficient but not necessary for stability. We further extend the framework to a discrete-time setting, showing that if $\sum_t \delta_t<r(X(0),A)$, then $A(X(t))=A(X(0))$, and we give a probabilistic bound on $\mathbb{E}[d(A(X),A(X'))]$ in terms of tail probabilities relative to $\gamma_i$. This framework identifies the margin as the key quantity governing both worst-case and average stability, and provides explicit conditions under which clustering-induced partitions remain invariant in a fixed-center Euclidean model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper studies the stability of partitions induced by nearest-center assignment on a finite point set X in Euclidean space under perturbations. It equips the space of partitions with a normalized pairwise disagreement metric and defines the stability radius r(X,A) as the largest ε such that any perturbation of size at most ε (in the max-over-points norm) preserves the assignment A. For each point it introduces the margin γ_i as the difference between its distance to the assigned center and the next-closest center, proves that ε < γ_min/2 implies A(X')=A(X) for all such perturbations, shows the converse that any changing point must satisfy γ_i ≤ 2ε, constructs configurations in which arbitrarily small perturbations alter the partition (showing the margin bound is sufficient but not necessary), and extends the analysis to accumulated perturbations over discrete time steps and to a probabilistic bound on expected disagreement in terms of tail probabilities relative to the γ_i.

Significance. If the central claims hold, the work supplies a clean geometric criterion—the per-point margin—for determining when fixed-center clustering assignments remain invariant under noise. The deterministic results follow directly from the triangle inequality applied to distance differences, while the constructions, discrete-time accumulation result, and probabilistic tail bound provide additional insight into both worst-case and average-case behavior. Credit is due for the explicit identification of the margin as the governing quantity and for the parameter-free nature of the derived bounds.

minor comments (3)
  1. The definition of the perturbation norm ||X-X'|| = max_i ||x_i - x'_i|| should explicitly state the norm on each R^d vector (Euclidean or otherwise) to make the factor of 2 in the margin bound fully transparent; this affects only the constant, not the logic.
  2. In the probabilistic extension, the distribution assumed for the perturbation X' should be stated explicitly (e.g., independent per-coordinate or per-point noise) so that the tail-probability bound can be verified without ambiguity.
  3. The normalized pairwise disagreement metric d(·,·) on partitions is introduced in the abstract but would benefit from a one-sentence definition or reference in the main text for readers unfamiliar with partition metrics.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the accurate and positive summary of our manuscript, which correctly identifies the stability radius, the role of the per-point margins γ_i, the deterministic bounds via the triangle inequality, the counterexample constructions, the discrete-time accumulation result, and the probabilistic tail bound. We appreciate the recommendation for minor revision and the recognition that the margin provides a clean, parameter-free criterion for invariance under perturbations.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The derivation proceeds directly from the definitions of margin γ_i = min_{j≠ℓ_i} (‖x_i - c_j‖ - ‖x_i - c_ℓ_i‖) and perturbation bound ‖X - X'‖ ≤ ε in the infinity norm. The key inequality follows from the triangle inequality: each distance changes by at most ε, so any pairwise difference changes by at most 2ε. Thus ε < γ_min/2 forces all nearest-center assignments to remain unchanged, and the converse is immediate as the contrapositive. No parameter is fitted to data, no self-citation is load-bearing, and no ansatz or renaming is invoked; the results are self-contained algebraic consequences of the stated model assumptions on fixed centers and finite point sets.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The central claims rest on standard geometric definitions and assumptions rather than fitted parameters or new physical entities. The stability radius and margins are introduced as derived quantities from the model.

axioms (2)
  • standard math Perturbations are measured using the max-norm ||X - X'|| = max_i ||x_i - x'_i|| over a finite set of points in Euclidean space R^d
    Invoked in the definition of the stability radius and all perturbation bounds
  • domain assumption Centers {c_1, ..., c_k} are fixed and the assignment uses Euclidean nearest-center rule
    Core to the definition of A(X) and the margin γ_i
invented entities (2)
  • stability radius r(X,A) no independent evidence
    purpose: Quantifies the largest perturbation size that leaves the induced partition unchanged
    Newly defined quantity central to the stability analysis
  • per-point margin γ_i no independent evidence
    purpose: Measures distance to the decision boundary for each point's assignment
    Introduced to localize instability and derive the ε < γ_min/2 condition

pith-pipeline@v0.9.0 · 5680 in / 1433 out tokens · 52413 ms · 2026-05-10T10:03:28.702276+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

6 extracted references · 6 canonical work pages

  1. [1]

    Jain, A. K. (2010). Data clustering: 50 years beyond K-means.Pattern Recognition Letters, 31(8), 651–666

  2. [2]

    Xu, R., & Wunsch, D. (2005). Survey of clustering algorithms.IEEE Transactions on Neural Networks, 16(3), 645–678

  3. [3]

    Ben-David, S., von Luxburg, U., & Pál, D. (2006). A sober look at clustering stability. InProceedings of the 19th Annual Conference on Learning Theory

  4. [4]

    von Luxburg, U. (2010). Clustering stability: an overview.Foundations and Trends in Machine Learning, 2(3), 235–274

  5. [5]

    Stanley, R. P. (2011).Enumerative Combinatorics, Volume 1(2nd ed.). Cambridge University Press

  6. [6]

    Princeton University, USA Email address:nahid.sabit@princeton.edu United World College, Mostar, Bosnia & Herzegovina Email address:faija.anjum@uwcmostar.ba

    Rand, W.M.(1971).Objectivecriteriafortheevaluationofclusteringmethods.Journal of the American Statistical Association, 66(336), 846–850. Princeton University, USA Email address:nahid.sabit@princeton.edu United World College, Mostar, Bosnia & Herzegovina Email address:faija.anjum@uwcmostar.ba