Error estimates for deterministic empirical approximations of probability measures

Benjamin Seeger

arXiv: 2510.03451 · v2 · submitted 2025-10-03 · 🧮 math.PR · math.OC

Pith reviewed 2026-05-18 10:00 UTC · model grok-4.3

keywords: Wasserstein distance · deterministic quantization · empirical approximation · error estimates · moment conditions · dimension dependence · optimal rates · unbounded support

The pith

Any probability measure with enough finite moments can be approximated in Wasserstein distance by N equally weighted points, and the optimal error decays at an explicit rate depending on the moment order, the exponent p, and the dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper determines how closely an arbitrary probability measure can be matched by the best choice of N points carrying equal mass 1/N each, where closeness is measured in Wasserstein distance. It derives explicit upper and lower bounds on the smallest achievable distance and shows that this distance tends to zero at a rate governed by the measure's moment order, the Wasserstein exponent p, and the ambient dimension. In certain low-dimensional regimes and for measures with unbounded support, the rates improve on those available from other methods, including random sampling. This matters because the bounds give concrete guidance on the minimal number of atoms needed to control approximation error in transport, integration, or simulation tasks without relying on randomness.
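To make the object concrete, here is a minimal Python sketch for the one-dimensional case with p = 2. It relies on a standard fact about transport on the line rather than on the paper's construction: the optimal coupling is monotone, so the i-th atom serves the i-th quantile slice of mass 1/N, and the slice mean minimises the squared cost. The function name `uniform_quantizer_1d`, its arguments, and the midpoint-rule resolution `nodes` are illustrative assumptions.

```python
import math
from statistics import NormalDist

def uniform_quantizer_1d(quantile, n_points, nodes=200):
    """Approximate the best n_points equal-weight atoms for a 1D law in W_2.

    quantile: inverse CDF of the target law (hypothetical argument name).
    nodes:    midpoint-rule nodes per quantile slice (a numerical assumption).
    Returns (atoms, approximate W_2 error).
    """
    atoms, sq_err = [], 0.0
    m = n_points * nodes
    for i in range(n_points):
        # Slice i carries the mass between quantile levels i/N and (i+1)/N.
        vals = [quantile((i * nodes + j + 0.5) / m) for j in range(nodes)]
        centre = sum(vals) / nodes  # the slice mean minimises the squared cost
        atoms.append(centre)
        sq_err += sum((v - centre) ** 2 for v in vals) / m
    return atoms, math.sqrt(sq_err)

# Example: standard normal law, whose support is unbounded.
std_normal = NormalDist().inv_cdf
errors = {n: uniform_quantizer_1d(std_normal, n)[1] for n in (4, 16, 64)}
```

The dictionary `errors` holds approximate optimal W_2 errors for three values of N. In higher dimensions no such closed-form slicing is available, which is where dimension-dependent estimates become relevant.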

Core claim

Estimates are obtained for the optimal approximation distance, with an explicit rate of convergence to 0 as the number of points tends to infinity that depends on the moment order, the parameter in the Wasserstein distance, and the dimension. In certain low-dimensional regimes and for measures with unbounded support, the rates are improvements over those obtained through other methods, including through random sampling. Except for some critical cases, the rates are shown to be optimal.

What carries the argument

The minimal Wasserstein distance from the given measure to the collection of all N-point uniform discrete probability measures; moment assumptions are used to bound this quantity from above and below by dimension-dependent powers of 1/N.
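In symbols, following the notation of the paper passage quoted elsewhere on this page, the quantity can be written as follows. The explicit form of $\mu^N_x$ as the equal-weight empirical measure, the index set of the infimum, and the letter $q$ for the moment order are assumptions consistent with the description above, not formulas copied from the paper.

```latex
e_{N;d,p}(\mu) := \inf_{x \in (\mathbb{R}^d)^N} W_p\bigl(\mu, \mu^N_x\bigr),
\qquad
\mu^N_x := \frac{1}{N} \sum_{i=1}^{N} \delta_{x_i},
\qquad
\text{rate: } N^{-1/d} \vee N^{-(1/p - 1/q)} .
```

Since both terms are negative powers of $N$, the maximum is governed by the smaller of the two exponents $1/d$ and $1/p - 1/q$.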

If this is right

The approximation error vanishes at a polynomial rate in N whose exponent improves when dimension is small relative to the available moments.
The bounds continue to hold for measures with unbounded support, removing the compact-support restriction common in earlier quantization results.
Optimality outside critical cases implies no deterministic point placement can asymptotically outperform the stated rates.
The estimates justify replacing random sampling by deterministic constructions that achieve the same or better error guarantees in low-dimensional settings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The rates could be used to choose the smallest N that keeps transport or integration error below a prescribed tolerance when moment information is available.
In high dimensions the deterministic advantage vanishes, suggesting that the curse of dimensionality affects both random and optimal deterministic approximations equally.
Numerical checks on standard families such as Gaussians across dimensions would locate the boundary between the improved-rate and standard-rate regimes.
The same moment-based comparison technique might extend to other distances or to non-uniform weights, though the paper does not pursue those cases.
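The first extension above, choosing the smallest N for a given tolerance, can be sketched in Python as follows. The exponent is read from the rate N^{-1/d} ∨ N^{-1/p+1/q} quoted on this page. The multiplicative constant `const` is purely hypothetical, since this page reports no explicit constant, and the parameter check (d ≥ 1, q > p > 0) is an assumption. Critical parameter combinations are not handled.

```python
import math

def rate_exponent(d, p, q):
    """Exponent a in the rate N**(-a), read from N^{-1/d} v N^{-1/p+1/q}.

    The maximum of two negative powers of N is driven by the smaller exponent.
    Critical parameter combinations are not treated here.
    """
    if not (d >= 1 and q > p > 0):
        raise ValueError("expected d >= 1 and q > p > 0 (assumed ranges)")
    return min(1.0 / d, 1.0 / p - 1.0 / q)

def smallest_n(tol, const, d, p, q):
    """Smallest N with const * N**(-a) <= tol.

    const is a hypothetical constant in the upper bound, not a value from the paper.
    """
    a = rate_exponent(d, p, q)
    n = max(1, math.ceil((const / tol) ** (1.0 / a)))
    # Correct for floating-point rounding at the boundary.
    while const * n ** (-a) > tol:
        n += 1
    while n > 1 and const * (n - 1) ** (-a) <= tol:
        n -= 1
    return n
```

For instance, `smallest_n(0.03, 1.0, d=2, p=1, q=4)` uses the exponent min(1/2, 3/4) = 1/2. Any real use would need an actual constant from the paper's theorems.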

Load-bearing premise

The probability measure possesses finite moments of order high enough relative to the Wasserstein parameter and dimension so that the distance to any discrete approximation remains finite.

What would settle it

A probability measure with the required finite moments for which the minimal Wasserstein distance to any N-point uniform discrete measure decays at a slower rate than the derived upper bound, or faster than the lower bound in non-critical regimes, would disprove the estimates.
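Testing a candidate measure against this criterion requires reading a decay exponent off computed errors. One simple way, sketched below in Python, is a least-squares fit on a log-log scale. The function name is hypothetical, and a fit over finitely many N can only suggest an asymptotic rate, never establish or refute one.

```python
import math

def fitted_decay_exponent(ns, errs):
    """Least-squares slope of log(err) against log(N), with the sign flipped.

    Returns a such that err is roughly proportional to N**(-a) on the given data.
    """
    xs = [math.log(n) for n in ns]
    ys = [math.log(e) for e in errs]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return -slope

# Synthetic check: errors that decay exactly like N**(-1/2).
ns = [10, 100, 1000]
a_hat = fitted_decay_exponent(ns, [n ** -0.5 for n in ns])
```

On the synthetic data the fit recovers an exponent close to 1/2. On real data the fitted value would be compared with the exponent predicted for the given moment order, p, and dimension.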

Original abstract

The question of optimally approximating an arbitrary probability measure in the Wasserstein distance by a discrete one with uniform weights is considered. Estimates are obtained for the optimal approximation distance, with an explicit rate of convergence to $0$ as the number of points tends to infinity that depends on the moment order, the parameter in the Wasserstein distance, and the dimension. In certain low-dimensional regimes and for measures with unbounded support, the rates are improvements over those obtained through other methods, including through random sampling. Except for some critical cases, the rates are shown to be optimal.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

This paper gives explicit rates for the minimal Wasserstein distance from a measure to the set of N-point uniform discrete measures, with improvements over random sampling in certain low-dimensional regimes and for measures with unbounded support, and with optimality except in critical cases.

The main thing to know is that this work derives explicit rates for the smallest Wasserstein error when approximating a measure by an N-point uniform discrete one. The rates depend on the available moment order, the Wasserstein exponent p, and the dimension d. They improve on random sampling in certain low-dimensional regimes and for measures with unbounded support, and the paper shows they are optimal except in a few critical cases.

Referee Report

0 major / 2 minor

Summary. The manuscript studies the problem of optimally approximating an arbitrary probability measure μ by an n-point discrete measure with uniform weights in the p-Wasserstein distance W_p. It derives explicit upper bounds on the minimal approximation error that decay to zero at rates depending on the moment order q of μ, the parameters p and d, and shows these rates are optimal except in certain critical regimes. In low-dimensional cases with unbounded support the deterministic rates improve on those available from random sampling.

Significance. If the stated rates and optimality claims hold, the work supplies deterministic, explicit error bounds for Wasserstein quantization that are sharper than Monte-Carlo alternatives in selected regimes. The moment hypotheses are minimal for the distances to be finite, and the optimality statements (outside critical cases) strengthen the contribution to the theory of empirical measures and optimal transport.

minor comments (2)

The abstract and introduction should explicitly state the precise range of p, q, d for which the main theorems apply, including the critical cases where optimality is not claimed.
Notation for the optimal n-point approximation error (e.g., inf over discrete measures) should be introduced once and used consistently throughout the proofs.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary and significance assessment of our manuscript on error estimates for deterministic empirical approximations. The recommendation of minor revision is noted. No specific major comments appear in the report, so we have no individual points requiring direct rebuttal or revision at this stage. We remain available to address any additional feedback.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained analytic estimates

The paper derives explicit convergence rates for the optimal Wasserstein approximation error by n-point discrete measures directly from moment assumptions and properties of the Wasserstein metric. No load-bearing step reduces by construction to a fitted parameter, self-definition, or self-citation chain; the moment condition is the minimal hypothesis needed for the distances to be finite, and optimality is shown via matching lower bounds rather than by renaming or reparameterizing inputs. The central claims rest on independent analytic arguments that do not presuppose the target rates.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The work relies on the established theory of Wasserstein distances and moment conditions for probability measures; no new free parameters, invented entities, or ad-hoc axioms are introduced beyond standard assumptions in optimal transport.

axioms (1)

standard math: Wasserstein distance is well-defined and finite when the measure has finite moments of appropriate order. Invoked implicitly to ensure the approximation error is finite and the rates make sense.

pith-pipeline@v0.9.0 · 5604 in / 1290 out tokens · 47324 ms · 2026-05-18T10:00:35.316240+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: $e_{N;d,p}(\mu) := \inf W_p(\mu, \mu^N_x)$ with rates $N^{-1/d} \vee N^{-1/p+1/q}$ (Theorem 1) obtained by multiscale Lp quantity and dyadic cube assignment

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: multiscale analysis over dyadic partitions $D_\ell$ of cubes $Q_n$ and deterministic point placement via Lemmas 3–4

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.