
arXiv: 1907.02006 · v1 · pith:RW2NI6GCnew · submitted 2019-07-03 · math.PR · math.ST · stat.TH

Bounding quantiles of Wasserstein distance between true and empirical measure

Pith reviewed 2026-05-25 09:41 UTC · model grok-4.3

classification: math.PR · math.ST · stat.TH
keywords: Wasserstein distance · empirical measure · quantiles · asymptotic bounds · confidence regions · unit interval · mixture distributions

The pith

The normalized quantiles of the Wasserstein distance to the empirical measure reach their asymptotic maximum for convex combinations of the uniform distribution on the two points {0,1} and the uniform distribution on [0,1].

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the random Wasserstein distance between a fixed probability distribution P on the unit interval and the empirical measure formed from N independent samples. It establishes that the upper quantiles of this distance, after suitable normalization, are asymptotically largest precisely when P belongs to a one-parameter family of mixtures between the uniform distribution on the two endpoints and the Lebesgue measure on the whole interval. This extremal property directly supplies explicit asymptotic confidence regions for the unknown P. The argument is carried out for the one-dimensional case; numerical checks are provided for possible higher-dimensional analogues.

Core claim

The main result states that the normalized quantiles of the Wasserstein distance between P and its empirical measure are asymptotically maximised when P is a convex combination of the uniform distribution supported on {0,1} and the uniform distribution on [0,1]. This characterisation yields explicit asymptotic confidence regions for P.
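To make the object in the core claim concrete, here is a minimal sketch (not the authors' code) that draws samples from one member of the mixture family and evaluates the distance to the empirical measure exactly. It assumes the order-1 Wasserstein distance, for which the one-dimensional identity $W_1(P, \hat{P}_N) = \int_0^1 |F_N(x) - F(x)|\,dx$ holds, and it assumes the normalisation is the usual $\sqrt{N}$ scaling. The names `sample_mixture` and `w1_to_mixture`, and the parameter `lam` (the weight placed on the two endpoints), are illustrative.

```python
import numpy as np

def sample_mixture(lam, n, rng):
    """Draw n i.i.d. samples from lam * (delta_0 + delta_1)/2 + (1 - lam) * U[0, 1]."""
    from_endpoints = rng.random(n) < lam
    endpoints = rng.integers(0, 2, size=n).astype(float)  # 0.0 or 1.0, equally likely
    return np.where(from_endpoints, endpoints, rng.random(n))

def w1_to_mixture(samples, lam):
    """Exact integral over [0, 1] of |F_N - F|, where F is the mixture CDF."""
    n = len(samples)
    # Breakpoints: F_N is constant between consecutive breakpoints.
    pts = np.unique(np.concatenate(([0.0, 1.0], samples)))
    left, right = pts[:-1], pts[1:]
    f_emp = np.searchsorted(np.sort(samples), left, side="right") / n
    # On [left, right) the mixture CDF is lam/2 + (1 - lam) * x, so the
    # difference F_N - F is linear there; record its two endpoint values.
    d_left = f_emp - (lam / 2 + (1 - lam) * left)
    d_right = f_emp - (lam / 2 + (1 - lam) * right)
    a, b = np.abs(d_left), np.abs(d_right)
    same_sign = d_left * d_right >= 0
    denom = np.where(a + b > 0, a + b, 1.0)
    # Trapezoid if no sign change, two triangles if the difference crosses zero.
    area = np.where(same_sign, (a + b) / 2, (a**2 + b**2) / (2 * denom))
    return float(np.sum(area * (right - left)))

# Empirical 0.95-quantile of sqrt(N) * W1 for one illustrative mixture weight.
rng = np.random.default_rng(0)
N, M, lam = 200, 500, 0.5
draws = [np.sqrt(N) * w1_to_mixture(sample_mixture(lam, N, rng), lam) for _ in range(M)]
q95 = float(np.quantile(draws, 0.95))
print(f"lam={lam}: empirical 0.95-quantile of sqrt(N)*W1 = {q95:.3f}")
```

Repeating the last lines for several values of `lam` gives a rough picture of how the empirical quantile varies across the family.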

What carries the argument

Asymptotic maximisation of the normalised quantiles of the Wasserstein distance over the choice of P.

If this is right

  • Explicit asymptotic confidence regions for the unknown measure P can be written down in closed form.
  • The worst-case distributions for quantile bounds on the Wasserstein distance belong to the indicated one-parameter family.
  • The same extremal family governs the large-sample behaviour of the distance quantiles uniformly over all P.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar maximisation may hold for other optimal-transport costs once the ambient space is fixed.
  • The result supplies a concrete benchmark against which numerical or Monte-Carlo approximations of Wasserstein quantiles can be validated.
  • In statistical applications the explicit regions give a practical way to construct tests or bands that remain valid without further tuning.

Load-bearing premise

The underlying space is the unit interval equipped with the standard Wasserstein metric and the samples are i.i.d.

What would settle it

For large N, compute the relevant quantile of the Wasserstein distance for a distribution P outside the claimed family and check whether it exceeds the quantile obtained from the extremal mixture; any consistent excess would refute the maximisation claim.
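The check described above can be sketched as a small Monte Carlo experiment. The code below is an illustration under stated assumptions, not a reproduction of the paper's numerics. It assumes the order-1 distance with $\sqrt{N}$ normalisation, approximates $\int_0^1 |F_N - F|$ with a midpoint rule, uses the distribution with CDF $x^2$ as an arbitrary candidate outside the family, and scans a coarse grid of mixture weights because the extremal weight at a given level is not stated here. All function names and constants are hypothetical.

```python
import numpy as np

# Midpoints of 4000 equal cells of [0, 1]; all lie strictly inside (0, 1).
GRID = (np.arange(4000) + 0.5) / 4000

def w1_on_grid(samples, cdf):
    """Midpoint-rule approximation of the integral of |F_N - F| over [0, 1]."""
    f_emp = np.searchsorted(np.sort(samples), GRID, side="right") / len(samples)
    return float(np.mean(np.abs(f_emp - cdf(GRID))))

def quantile_of_scaled_w1(sampler, cdf, n, m, alpha, rng):
    """Empirical alpha-quantile of sqrt(n) * W1 from m independent repetitions."""
    draws = [np.sqrt(n) * w1_on_grid(sampler(n, rng), cdf) for _ in range(m)]
    return float(np.quantile(draws, alpha))

def mixture(lam):
    """Sampler and CDF of lam * (delta_0 + delta_1)/2 + (1 - lam) * U[0, 1]."""
    def sampler(n, rng):
        use_endpoints = rng.random(n) < lam
        endpoints = rng.integers(0, 2, size=n).astype(float)
        return np.where(use_endpoints, endpoints, rng.random(n))
    def cdf(x):
        return lam / 2 + (1 - lam) * x  # valid for interior points 0 < x < 1
    return sampler, cdf

# Candidate outside the family (an arbitrary choice): density 2x, CDF x^2.
def candidate_sampler(n, rng):
    return np.sqrt(rng.random(n))  # inverse-CDF sampling

def candidate_cdf(x):
    return x**2

rng = np.random.default_rng(1)
N, M, ALPHA = 200, 400, 0.95
q_candidate = quantile_of_scaled_w1(candidate_sampler, candidate_cdf, N, M, ALPHA, rng)
q_family = max(
    quantile_of_scaled_w1(*mixture(lam), N, M, ALPHA, rng)
    for lam in np.linspace(0.0, 1.0, 6)
)
print(f"candidate: {q_candidate:.3f}, best over mixture grid: {q_family:.3f}")
```

A single run at finite N carries Monte Carlo noise and cannot refute an asymptotic statement on its own; only an excess that persists as N and M grow would count as evidence against the claim.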

Figures

Figures reproduced from arXiv: 1907.02006 by Johannes Wiesel, Martin N. A. Tegnér, Samuel N. Cohen.

Figure 1: The distribution function of the rescaled Wasserstein distance $\sum_{i=1}^{n-1} |B(q_i)|(x_{i+1} - x_i)$ for different measures with support $\{x_1, \dots, x_n\}$. Left: $n = 10$. Right: $n = 1000$. In both figures, the case $P = \frac{1}{2}(\delta_0 + \delta_1)$ is in black and $P = U(\{x_1, \dots, x_n\})$ in blue, and convex combinations of the two in red. Green lines are from 20 different measures generated randomly.
Figure 2: Estimated value of $\lambda(\alpha)$ at each confidence level $\alpha$, as in Proposition 2.2, computed in the case $n = 10$. The Wasserstein distance with $\ell_1$-norm between $P$ and a measure $\tilde{P}$ with support $\tilde{x}$ of size $\tilde{n} \times \tilde{m}$ and probabilities $\tilde{p} \in \mathcal{P}(\tilde{x})$ is given by $W(P, \tilde{P}) = \inf_{\pi \in \Pi(P, \tilde{P})} \sum_{i,j,k,l} \|x_{ij} - \tilde{x}_{kl}\|_{\ell_1} \pi_{ijkl}$ (4.2), where $\Pi(P, \tilde{P})$ is the set of couplings in $\mathcal{P}([0,1]^4)$ that can be identified with tensors $\pi$ of dimensio…
Figure 3: A probability measure on $[0,1]^2$ supported on the vertices of a 3 × 3 grid. … the interior-point method of Andersen and Andersen [2] to solve the linear program. (Plot axes: d, CDF.)
Figure 4: Empirical distribution function of $W(P, \hat{P}_{100})$, from M = 500 draws, with the measure shown in …
Figure 5: A maximising measure of the empirical quantile function of $W(P, \hat{P}_{100})$ at level $\alpha = 0.95$, from M = 100 draws.
Figure 6: Maximising measures of the empirical quantile function of $W(P, \hat{P}_{100})$ at different confidence levels, from M = 100 draws. (Panel titles: 0.05-quantile, 0.1-quantile; axes: x, x, p.)
Figure 7: Maximising measures of the empirical quantile function of $W(P, \hat{P}_{100})$ at different confidence levels, from M = 100 draws. Conjecture 4.1. Let $d \ge 1$, $K \subseteq \mathbb{R}^d$ be a convex compact set such that $0 < \mu(K)$, where $\mu$ denotes the Lebesgue measure on $\mathbb{R}^d$. Furthermore let $x_0, x_1 \in \mathbb{R}^d$ attain $\operatorname{argmax}\{\|y_0 - y_1\|_{\ell_1} : y_0, y_1 \in K\}$ …
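The linear program (4.2) quoted in the Figure 2 caption, for two finitely supported measures, can be sketched as follows. This is an illustrative implementation only: the paper reports using the MOSEK interior-point optimizer [2], whereas this sketch swaps in SciPy's `linprog` with the HiGHS backend. It also flattens each support into a plain list of points instead of the four-index tensor notation. The function name `wasserstein_l1` and the array layout are assumptions.

```python
import numpy as np
from scipy.optimize import linprog

def wasserstein_l1(x, p, y, q):
    """Optimal-transport cost between sum_i p_i delta_{x_i} and sum_j q_j delta_{y_j}
    with ground cost ||x_i - y_j||_1. x has shape (n, d), y has shape (m, d)."""
    n, m = len(p), len(q)
    cost = np.abs(x[:, None, :] - y[None, :, :]).sum(axis=2)  # shape (n, m)
    # The coupling pi is flattened row-major: entry (i, j) sits at index i * m + j.
    row_sums = np.kron(np.eye(n), np.ones((1, m)))  # sum_j pi_ij = p_i
    col_sums = np.kron(np.ones((1, n)), np.eye(m))  # sum_i pi_ij = q_j
    res = linprog(
        cost.ravel(),
        A_eq=np.vstack([row_sums, col_sums]),
        b_eq=np.concatenate([p, q]),
        bounds=(0, None),
        method="highs",
    )
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.fun)

# Example on the vertices of a 3 x 3 grid in [0, 1]^2 (weights chosen for illustration).
axis = np.linspace(0.0, 1.0, 3)
grid = np.array([(a, b) for a in axis for b in axis])  # 9 vertices
p = np.full(9, 1.0 / 9.0)                              # uniform weights
q = np.zeros(9)
q[0] = q[8] = 0.5                                      # mass on corners (0,0) and (1,1)
dist = wasserstein_l1(grid, p, grid, q)
print(f"W(P, P~) = {dist:.4f}")
```

The dense constraint matrix has (n + m) rows and n·m columns, which is fine for small grids like this one; larger supports would call for a sparse matrix or a dedicated optimal-transport solver.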
Abstract

Consider the empirical measure, $\hat{\mathbb{P}}_N$, associated to $N$ i.i.d. samples of a given probability distribution $\mathbb{P}$ on the unit interval. For fixed $\mathbb{P}$ the Wasserstein distance between $\hat{\mathbb{P}}_N$ and $\mathbb{P}$ is a random variable on the sample space $[0,1]^N$. Our main result is that its normalised quantiles are asymptotically maximised when $\mathbb{P}$ is a convex combination between the uniform distribution supported on the two points $\{0,1\}$ and the uniform distribution on the unit interval $[0,1]$. This allows us to obtain explicit asymptotic confidence regions for the underlying measure $\mathbb{P}$. We also suggest extensions to higher dimensions with numerical evidence.
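The limiting quantity named in the Figure 1 caption, $\sum_{i=1}^{n-1} |B(q_i)|(x_{i+1} - x_i)$, can be simulated directly for a finitely supported $P$. The sketch below assumes that $B$ is a standard Brownian bridge and that $q_i = p_1 + \dots + p_i$ are the cumulative weights of $P$; neither is spelled out in the excerpt reproduced here, so both should be read as assumptions. The function name `limit_draws` and the sample sizes are hypothetical.

```python
import numpy as np

def limit_draws(x, p, m, rng):
    """Draw m samples of sum_i |B(q_i)| (x_{i+1} - x_i) for sorted support x, weights p."""
    x = np.asarray(x, dtype=float)
    q = np.cumsum(p)[:-1]  # assumed: cumulative weights q_1, ..., q_{n-1}
    # Brownian bridge covariance: Cov(B(s), B(t)) = min(s, t) - s * t.
    cov = np.minimum.outer(q, q) - np.outer(q, q)
    bridge = rng.multivariate_normal(np.zeros(len(q)), cov, size=m)  # shape (m, n-1)
    return np.abs(bridge) @ np.diff(x)

rng = np.random.default_rng(2)

# Two-point measure (delta_0 + delta_1) / 2.
two_point = limit_draws([0.0, 1.0], [0.5, 0.5], 20000, rng)

# Uniform measure on 10 equally spaced points of [0, 1].
support = np.linspace(0.0, 1.0, 10)
uniform_grid = limit_draws(support, np.full(10, 0.1), 20000, rng)

q_two_point = float(np.quantile(two_point, 0.95))
q_uniform_grid = float(np.quantile(uniform_grid, 0.95))
print(f"0.95-quantiles: two-point {q_two_point:.3f}, uniform grid {q_uniform_grid:.3f}")
```

Under these assumptions the two-point case reduces to $|B(1/2)|$ with $B(1/2) \sim N(0, 1/4)$, so its 0.95-quantile is about $1.96/2 \approx 0.98$, which gives a quick sanity check on the simulation.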

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript studies the random variable given by the 1-Wasserstein distance between an empirical measure formed from N i.i.d. samples and the underlying probability P supported on [0,1]. The central claim is that the normalized quantiles of this distance are asymptotically maximized precisely when P belongs to the one-parameter family of convex combinations of the two-point uniform measure on {0,1} and Lebesgue measure on [0,1]; the resulting explicit form yields asymptotic confidence regions for P. Numerical illustrations are provided for extensions to higher-dimensional settings.

Significance. If the asymptotic maximization result holds, the paper supplies a concrete, usable family of worst-case distributions that deliver explicit asymptotic confidence regions for an unknown measure under the Wasserstein metric. This is a useful contribution to nonparametric statistics and optimal transport, where such explicit quantile bounds are otherwise unavailable. The identification of the extremal family and the explicit confidence-region construction are the primary strengths.

major comments (1)
  1. The abstract and introduction state the maximization result for the unit interval equipped with the 1-Wasserstein metric and i.i.d. sampling, but the manuscript must make explicit the precise regularity conditions (e.g., moment assumptions or continuity requirements on the quantile functions) under which the asymptotic equivalence holds; without these the claim that the identified family is maximal cannot be verified from the given statement alone.
minor comments (1)
  1. The numerical evidence for higher dimensions is mentioned but not accompanied by tables or figures showing sample sizes, dimension values, or quantitative comparison to the one-dimensional case; adding such detail would improve clarity.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and the constructive suggestion regarding regularity conditions. We address the point below and will incorporate the clarification in the revised manuscript.

  1. Referee: The abstract and introduction state the maximization result for the unit interval equipped with the 1-Wasserstein metric and i.i.d. sampling, but the manuscript must make explicit the precise regularity conditions (e.g., moment assumptions or continuity requirements on the quantile functions) under which the asymptotic equivalence holds; without these the claim that the identified family is maximal cannot be verified from the given statement alone.

    Authors: We agree that the conditions should be stated explicitly. Because the support is the compact interval [0,1], every probability measure P has finite moments of all orders, so no additional moment assumptions are required. The result holds for every P in the space of Borel probability measures on [0,1]; the only regularity used is that quantile functions are non-decreasing and left-continuous, which is the standard definition. In the revised version we will add a precise statement of these assumptions immediately after the abstract and in the introduction, making clear that the asymptotic maximisation holds for all such P with no further restrictions.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity


The abstract presents the central result as a derived theorem on the asymptotic maximizers of normalized quantiles of the 1-Wasserstein distance between empirical and true measures on [0,1] under i.i.d. sampling. No equations, fitted parameters, or self-citations are shown that reduce the claimed maximizers to a tautological re-expression of the inputs. The result is stated as a property of the specific setting rather than a self-definitional or fitted-input prediction, and the derivation chain is self-contained against external benchmarks with no load-bearing self-citation or ansatz smuggling visible.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the i.i.d. sampling model on the unit interval and the definition of the Wasserstein metric; no free parameters, new entities, or additional axioms are introduced in the abstract.

axioms (2)
  • Domain assumption: the N samples are i.i.d. draws from the unknown distribution P supported on [0,1].
    Explicitly stated in the abstract as the setup for the empirical measure.
  • Domain assumption: the distance under consideration is the Wasserstein metric on the unit interval.
    The entire analysis is conducted with respect to this metric on this space.

pith-pipeline@v0.9.0 · 5666 in / 1427 out tokens · 43302 ms · 2026-05-25T09:41:24.798992+00:00 · methodology


Reference graph

Works this paper leans on

27 extracted references · 27 canonical work pages · 2 internal anchors

  [1] Miklós Ajtai, János Komlós, and Gábor Tusnády. On optimal matchings. Combinatorica, 4(4):259–264, 1984.
  [2] Erling D. Andersen and Knud D. Andersen. The MOSEK interior point optimizer for linear programming: an implementation of the homogeneous algorithm. In High Performance Optimization, pages 197–232. Springer, 2000.
  [3] Reinhard Bergmann. Stochastic orders and their application to a unified approach to various concepts of dependence and association. Stochastic Order and Decisions under Risk, 1991.
  [4] Gérard Biau, Luc Devroye, and Gábor Lugosi. On the performance of clustering in Hilbert spaces. IEEE Transactions on Information Theory, 54(2):781–790, 2008.
  [5] François Bolley, Arnaud Guillin, and Cédric Villani. Quantitative concentration inequalities for empirical measures on non-compact spaces. Probability Theory and Related Fields, 137(3-4):541–593, 2007.
  [6] Eustasio del Barrio, Evarist Giné, and Carlos Matrán. Central limit theorems for the Wasserstein distance between the empirical and the true distributions. Annals of Probability, pages 1009–1071, 1999.
  [7] Sylvain Delattre, Siegfried Graf, Harald Luschgy, and Gilles Pagès. Quantization of probability distributions under norm-based distortion measures. Statistics & Decisions, 22(4/2004):261–282, 2004.
  [8] Steffen Dereich, Michael Scheutzow, and Reik Schottstedt. Constructive quantization: Approximation by empirical measures. In Annales de l'IHP Probabilités et statistiques, volume 49, pages 1183–1203, 2013.
  [9] Richard Mansfield Dudley. The speed of mean Glivenko–Cantelli convergence. The Annals of Mathematical Statistics, 40(1):40–50, 1969.
  [10] Olive Jean Dunn. Multiple comparisons among means. Journal of the American Statistical Association, 56(293):52–64, 1961.
  [11] Nicolas Fournier and Arnaud Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability Theory and Related Fields, 162(3-4):707–738, 2015.
  [12] Nicolas Fournier, Stéphane Mischler, et al. Rate of convergence of the Nanbu particle system for hard potentials and Maxwell molecules. The Annals of Probability, 44(1):589–627, 2016.
  [13] Jürg Hüsler, Regina Y. Liu, and Kesar Singh. A formula for the tail probability of a multivariate normal distribution and its applications. Journal of Multivariate Analysis, 82(2):422–430, 2002.
  [14] Thomas Laloë. l1-quantization and clustering in Banach spaces. Mathematical Methods of Statistics, 19(2):136–150, 2010.
  [15] Alfred Müller. Stochastic ordering of multivariate normal distributions. Annals of the Institute of Statistical Mathematics, 53(3):567–575, 2001.
  [16] Michael Osborne, Roman Garnett, and Stephen Roberts. Gaussian processes for global optimization. In Learning and Intelligent Optimisation, pages 1–15. Springer, 2009.
  [17] Gilles Pagès and Benedikt Wilbertz. Optimal Delaunay and Voronoi quantization schemes for pricing American style options. In Numerical Methods in Finance, pages 171–213. Springer, 2012.
  [18] Vladimir Piterbarg and Vadim Rolandovich Fatalov. The Laplace method for probability measures in Banach spaces. Russian Mathematical Surveys, 50(6):1151, 1995.
  [19] Gareth O. Roberts and Jeffrey S. Rosenthal. Shift-coupling and convergence rates of ergodic averages. Stochastic Models, 13(1):147–165, 1997.
  [20] Richard Samworth and Oliver Johnson. Convergence of the empirical process in Mallows distance, with an application to bootstrap performance. arXiv preprint math/0406603, 2004.
  [21] Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando De Freitas. Taking the human out of the loop: A review of Bayesian optimization. Proceedings of the IEEE, 104(1):148–175, 2016.
  [22] Michel Talagrand. The transportation cost from the uniform measure to the empirical measure in dimension ≥ 3. The Annals of Probability, pages 919–959, 1994.
  [23] Michel Talagrand. New concentration inequalities in product spaces. Inventiones mathematicae, 126(3):505–563, 1996.
  [24] Leonid Tolmatz. Asymptotics of the distribution of the integral of the absolute value of the Brownian bridge for large arguments. The Annals of Probability, 28(1):132–139, 2000.
  [25] Leonid Tolmatz et al. On the distribution of the square integral of the Brownian bridge. The Annals of Probability, 30(1):253–269, 2002.
  [26] Jonathan Weed and Francis Bach. Sharp asymptotic and finite-sample rates of convergence of empirical measures in Wasserstein distance. arXiv preprint arXiv:1707.00087, 2017.
  [27] Christopher K. I. Williams and Carl Edward Rasmussen. Gaussian processes for machine learning, volume 2. MIT Press, Cambridge, MA, 2006.