Finite-sample bounds for regularized optimal transport

Alberto Gonz\'alez-Sanz; Austin J. Stromme; Marcel Nutz

arxiv: 2606.25947 · v1 · pith:OTGUGJFPnew · submitted 2026-06-24 · 🧮 math.ST · stat.TH

Finite-sample bounds for regularized optimal transport

Alberto Gonz\'alez-Sanz , Marcel Nutz , Austin J. Stromme This is my paper

Pith reviewed 2026-06-25 19:14 UTC · model grok-4.3

classification 🧮 math.ST stat.TH

keywords optimal transportregularized optimal transportsample complexityfinite-sample boundsentropic regularizationintrinsic dimensionempirical estimationconvex regularization

0 comments

The pith

Quadratically regularized optimal transport estimates the unregularized cost at rate n to the power minus 2 over d plus 4.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives non-asymptotic bias and variance bounds for the empirical cost of regularized optimal transport under general convex regularizations such as Kullback-Leibler divergence and L^p penalties. The bounds are stated with explicit dependence on the regularization strength and on the intrinsic dimension d of the marginal distributions. These results improve existing guarantees for entropic optimal transport and supply the first fully quantitative finite-sample statements for L^p regularization with 1 less than p less than infinity. For quadratic transport costs the bounds imply that quadratic regularization recovers the unregularized optimal transport cost at the rate n to the power minus 2 over d plus 4.

Core claim

For general convex regularizations the regularized empirical optimal transport cost admits bias and variance bounds that depend explicitly on the regularization parameter and the intrinsic dimension d of the marginals; when the cost is quadratic, quadratic regularization therefore estimates the unregularized optimal transport cost at rate n^{-2/(d+4)}, which is the fastest non-asymptotic rate known for any regularized estimator.

What carries the argument

Non-asymptotic bias and variance bounds on the empirical regularized optimal transport cost, with explicit dependence on regularization parameter and intrinsic dimension.

If this is right

The same bounds apply simultaneously to Kullback-Leibler and to L^p regularizations with 1<p<infty.
The results improve all prior non-asymptotic guarantees for entropic optimal transport.
They supply the first explicit quantitative finite-sample statements for L^p regularization.
The derived rate n^{-2/(d+4)} is the fastest non-asymptotic rate currently available for any estimator based on regularized optimal transport.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The explicit dimension dependence suggests that the method remains useful when data lie on low-dimensional manifolds even if ambient dimension is large.
The general convex-regularization framework could be used to compare the statistical efficiency of different regularizers for a fixed computational budget.
The bias-variance decomposition may guide adaptive choice of the regularization parameter from data.

Load-bearing premise

The marginal distributions have a well-defined finite intrinsic dimension d and the regularization is a general convex function.

What would settle it

A numerical example with known finite d and quadratic cost in which the estimation error for the unregularized cost decays slower than n^{-2/(d+4)} would falsify the rate claim.

read the original abstract

We study the sample complexity of regularized optimal transport for general convex regularizations including the Kullback--Leibler divergence and $L^p$ penalties. Our main results are non-asymptotic bias and variance bounds for the empirical cost, with explicit dependence on the regularization parameter and on the intrinsic dimension of the marginals. Our approach simultaneously improves, unifies, and extends existing finite-sample bounds. In particular, we improve the state of the art for entropic optimal transport, and we obtain the first fully quantitative results for $L^p$ regularization with $1<p<\infty$. For the quadratic transport cost, we deduce that quadratically regularized optimal transport (i.e., $L^2$ regularization) estimates the unregularized optimal transport cost at rate $n^{-2/(d+4)}$, the fastest non-asymptotic rate currently available for any estimator based on regularized optimal transport.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper sharpens non-asymptotic bounds for regularized OT with explicit d and lambda dependence, including first quantitative L^p results.

read the letter

This paper delivers improved non-asymptotic bounds for the error between regularized empirical OT and the unregularized population cost. The main advance is a unified treatment across convex regularizers that gives explicit bias and variance terms depending on the regularization strength and the intrinsic dimension d of the marginals.

It improves the known rate for entropic OT and supplies the first fully quantitative finite-sample results for L^p regularization with 1 < p < infinity. Specializing to quadratic cost and L^2 regularization produces the rate n^{-2/(d+4)}, obtained by balancing the bias term against a variance term controlled by the entropy of the induced function class.

The derivation appears clean. The stress-test note confirms there are no circular steps or hidden uniformity requirements under the stated assumptions of finite d and convex regularizer.

The main limitation is the reliance on finite intrinsic dimension d; if the marginals lack this the rates do not hold. Constants are not discussed in the abstract, so their size and practicality remain to be checked in the proofs.

This is for researchers in optimal transport and high-dimensional statistics who need concrete guidance on sample sizes when using regularized estimators. The explicit dependence on d and lambda makes it directly usable for planning.

Send it to peer review. The claims are specific enough to verify against existing literature and the unification looks like a genuine step forward.

Referee Report

0 major / 3 minor

Summary. The paper derives non-asymptotic upper bounds on the bias and variance of the empirical regularized optimal transport cost for general convex regularizers (including KL and L^p penalties). The bounds are stated with explicit dependence on the regularization strength λ and the intrinsic dimension d of the marginals. Special cases recover improved rates for entropic OT and yield the first fully quantitative guarantees for 1 < p < ∞. For quadratic cost with L^2 regularization the paper obtains the rate n^{-2/(d+4)} by balancing bias and variance terms, claimed to be the fastest non-asymptotic rate among regularized OT estimators.

Significance. If the derivations are correct, the work unifies and strengthens finite-sample theory for regularized OT by controlling variance through the entropy of the function class induced by the regularizer. The explicit dimension dependence and the specialized n^{-2/(d+4)} rate for quadratic regularization constitute a concrete advance over prior non-asymptotic results. The paper's approach of balancing an explicit bias term against a variance term controlled by entropy is a methodological strength that yields parameter-dependent rates without hidden uniformity assumptions.

minor comments (3)

[§1] §1 (Introduction): the claim that the n^{-2/(d+4)} rate is 'the fastest non-asymptotic rate currently available' would benefit from an explicit side-by-side comparison (in a table or paragraph) with the best previously published rates for entropic and other regularized estimators.
[Main theorems] Theorem statements (e.g., the general bias-variance bound): the dependence on the intrinsic dimension d is stated via covering numbers or entropy integrals; a short remark clarifying whether d is the ambient or the manifold dimension would prevent misinterpretation.
[Notation] Notation section: the definition of the regularized empirical cost uses both λ and the regularizer φ; a single displayed equation collecting all symbols would improve readability for readers tracking the bias-variance tradeoff.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and for recommending minor revision. The report does not list any specific major comments.

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

full rationale

The paper derives non-asymptotic bias-variance bounds for the empirical regularized OT cost under finite intrinsic dimension d and general convex regularizers. The rate n^{-2/(d+4)} for quadratic cost with L^2 regularization follows from explicitly balancing the regularization bias term against the variance term whose dimension dependence enters via entropy of the induced function class. This balancing uses only the stated assumptions and contains no reduction of a claimed prediction to a fitted input, no self-definitional steps, and no load-bearing self-citations. The work is a direct theoretical derivation with explicit quantitative dependence on the premises; the central claim does not collapse to its inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the central claims rest on the convexity of the regularizer and the existence of a finite intrinsic dimension d for the marginals; no free parameters are fitted inside the bounds themselves and no new entities are introduced.

axioms (2)

domain assumption The regularization is a general convex function.
Abstract states results for general convex regularizations including KL and L^p penalties.
domain assumption The marginal distributions have finite intrinsic dimension d.
Bounds are stated to depend explicitly on the intrinsic dimension of the marginals.

pith-pipeline@v0.9.1-grok · 5682 in / 1506 out tokens · 55790 ms · 2026-06-25T19:14:49.284307+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

54 extracted references · 3 linked inside Pith

[1]

Balakrishnan, T

S. Balakrishnan, T. Manole, and L. Wasserman. Statistical inference for optimal transport maps: Recent advances and perspectives.arXiv:2506.19025, 2025

arXiv 2025
[2]

Bayraktar, S

E. Bayraktar, S. Eckstein, and X. Zhang. Stability and sample complexity of divergence regu- larized optimal transport.Bernoulli, 31(1):213–239, 2025

2025
[3]

Bigot, E

J. Bigot, E. Cazelles, and N. Papadakis. Central limit theorems for entropy-regularized op- timal transport on finite spaces and statistical applications.Electronic Journal of Statistics, 13(2):5120–5150, 2019

2019
[4]

Blondel, V

M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. volume 84 of Proceedings of Machine Learning Research, pages 880–889, 2018

2018
[5]

Boucheron, G

S. Boucheron, G. Lugosi, and P. Massart.Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Feb. 2013

2013
[6]

G. Carlier. On the linear convergence of the multimarginal Sinkhorn algorithm.SIAM J. Optim., 32(2):786–794, 2022

2022
[7]

Chewi, J

S. Chewi, J. Niles-Weed, and P. Rigollet.Statistical Optimal Transport. Springer, 2025

2025
[8]

Chizat, A

L. Chizat, A. Delalande, and T. Vaškevičius. Sharper exponential convergence rates for Sinkhorn’s algorithm in continuous settings.Math. Program., 215(1-2):809–858, 2026

2026
[9]

M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors,Advances in Neural Infor- mation Processing Systems, volume 26. Curran Associates, Inc., 2013

2013
[10]

del Barrio, A

E. del Barrio, A. González-Sanz, and J.-M. Loubes. Central limit theorems for general trans- portation costs.Ann. Inst. Henri Poincaré Probab. Stat., 60(2):847–873, 2024

2024
[11]

del Barrio, A

E. del Barrio, A. González-Sanz, and J.-M. Loubes. Central limit theorems for semi-discrete Wasserstein distances.Bernoulli, 30(1):554–580, 2024

2024
[12]

del Barrio, A

E. del Barrio, A. González-Sanz, J.-M. Loubes, and J. Niles-Weed. An improved central limit theorem and fast convergence rates for entropic transportation costs.SIAM Journal on Mathe- matics of Data Science, 5(3):639–669, 2023

2023
[13]

del Barrio, A

E. del Barrio, A. González-Sanz, J.-M. Loubes, and D. Rodríguez-Vítores. Distributional limit theory for optimal transport.arXiv:2505.19104, 2025

arXiv 2025
[14]

del Barrio and J.-M

E. del Barrio and J.-M. Loubes. Central limit theorems for empirical transportation cost in general dimension.Ann. Probab., 47(2):926–951, 2019

2019
[15]

R. M. Dudley. The speed of mean Glivenko-Cantelli convergence.Ann. Math. Statist., 40:40–50, 1968

1968
[16]

Eckstein and M

S. Eckstein and M. Nutz. Convergence rates for regularized optimal transport via quantization. Math. Oper. Res., 49(2):1223–1240, 2024. 20

2024
[17]

Essid and J

M. Essid and J. Solomon. Quadratically regularized optimal transport on graphs.SIAM J. Sci. Comput., 40(4):A1961–A1986, 2018

2018
[18]

Fournier and A

N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure.Probab. Theory Related Fields, 162(3-4):707–738, 2015

2015
[19]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices.Linear Algebra Appl., 114/115:717–735, 1989

1989
[20]

García Trillos, M

N. García Trillos, M. Gerlach, M. Hein, and D. Slepčev. Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace-Beltrami operator. Found. Comput. Math., 20(4):827–887, 2020

2020
[21]

García Trillos and D

N. García Trillos and D. Slepčev. On the rate of convergence of empirical measures in∞- transportation distance.Canad. J. Math., 67(6):1358–1383, 2015

2015
[22]

Genevay, L

A. Genevay, L. Chizat, F. Bach, M. Cuturi, and G. Peyré. Sample complexity of Sinkhorn divergences. In K. Chaudhuri and M. Sugiyama, editors,Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 ofProceedings of Machine Learning Research, pages 1574–1583. PMLR, 16–18 Apr 2019

2019
[23]

Goldfeld, K

Z. Goldfeld, K. Kato, G. Rioux, and R. Sadhu. Limit theorems for entropic optimal transport maps and Sinkhorn divergence.Electron. J. Statist., 18:980–1041, 2024

2024
[24]

Goldfeld, K

Z. Goldfeld, K. Kato, G. Rioux, and R. Sadhu. Statistical inference with regularized optimal transport.Inf. Inference J. IMA, 13(1):iaad056, 2024

2024
[25]

González-Sanz, E

A. González-Sanz, E. del Barrio, and M. Nutz. Sample complexity of quadratically regularized optimal transport.arXiv:2511.09807, 2025

arXiv 2025
[26]

González-Sanz, R

A. González-Sanz, R. S. Gvalani, and L. Koch. Sharp local sparsity of regularized optimal transport.arXiv:2604.00843, 2026

Pith/arXiv arXiv 2026
[27]

González-Sanz and M

A. González-Sanz and M. Nutz. Sparsity of quadratically regularized optimal transport: Scalar case, 2024. SIAM J. Math. Anal., forthcoming

2024
[28]

González-Sanz, M

A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Linear convergence of gradient descent for quadratically regularized optimal transport.Preprint arXiv:2509.08547, 2025

Pith/arXiv arXiv 2025
[29]

González-Sanz, M

A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Polyak–Łojasiewicz inequality for quadratically regularized optimal transport.SIAM J. Optim., 2026. forthcoming, arXiv:2605.27175

Pith/arXiv arXiv 2026
[30]

González-Sanz, S

A. González-Sanz, S. Eckstein, and M. Nutz. Sparse regularized optimal transport without curse of dimensionality.Preprint arXiv:2505.04721, 2025

arXiv 2025
[31]

González-Sanz and S

A. González-Sanz and S. Hundrieser. Weak limits for empirical entropic optimal transport: Beyond smooth costs.arXiv:2305.09745, 2023

arXiv 2023
[32]

González-Sanz, J.-M

A. González-Sanz, J.-M. Loubes, and J. Niles-Weed. Weak limits of entropy regularized optimal transport; potentials, plans and divergences.arXiv:2207.07427, 2024

arXiv 2024
[33]

Graf and H

S. Graf and H. Luschgy.Foundations of Quantization for Probability Distributions. Springer Berlin Heidelberg, 2000

2000
[34]

Groppe and S

M. Groppe and S. Hundrieser. Lower complexity adaptation for empirical entropic optimal transport.J. Mach. Learn. Res., 25:Paper No. 344, 55, 2024

2024
[35]

Hundrieser, T

S. Hundrieser, T. Staudt, and A. Munk. Empirical optimal transport between different measures adapts to lower complexity.Ann. Inst. Henri Poincaré Probab. Stat., 60(2):824–846, 2024

2024
[36]

Li and X

P. Li and X. Chen. Sample complexity and weak limits of nonsmooth multimarginal Schrödinger system with application to optimal transport barycenter.arXiv:2502.02726, 2025. 21

arXiv 2025
[37]

Liu, J.-G

A. Liu, J.-G. Liu, and Y. Lu. On the rate of convergence of empirical measure in∞-Wasserstein distance for unbounded density function.Quart. Appl. Math., 77(4):811–829, 2019

2019
[38]

D. A. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport.Appl. Math. Optim., 83(3):1919–1949, 2021

1919
[39]

Mena and J

G. Mena and J. Niles-Weed. Statistical bounds for entropic optimal transport: sample complex- ity and the central limit theorem. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019

2019
[40]

Muzellec, R

B. Muzellec, R. Nock, G. Patrini, and F. Nielsen. Tsallis regularized optimal transport and ecological inference.Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), Feb. 2017

2017
[41]

Niles-Weed and P

J. Niles-Weed and P. Rigollet. Estimation of Wasserstein distances in the spiked transport model.Bernoulli, 28(4):2663–2688, 2022

2022
[42]

M. Nutz. Quadratically regularized optimal transport: existence and multiplicity of potentials. SIAM J. Math. Anal., 57(3):2622–2649, 2025

2025
[43]

Peyré and M

G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019

2019
[44]

Rigollet and A

P. Rigollet and A. J. Stromme. On the sample complexity of entropic optimal transport.Ann. Statist., 53(1):61–90, 2025

2025
[45]

Sadhu, Z

R. Sadhu, Z. Goldfeld, and K. Kato. Stability and statistical inference for semidiscrete optimal transport maps.Ann. Appl. Probab., 34(6):5694–5736, 2024

2024
[46]

Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Non- linear Differential Equations and their Applications

F. Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Non- linear Differential Equations and their Applications. Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling

2015
[47]

A. J. Stromme. Minimum intrinsic dimension scaling for entropic optimal transport. arXiv:2306.03398, 2023

arXiv 2023
[48]

A. W. van der Vaart and J. A. Wellner.Weak Convergence and Empirical Processes. Springer, 1996

1996
[49]

Villani.Optimal Transport: Old and New

C. Villani.Optimal Transport: Old and New. Springer-Verlag, Berlin, 2009

2009
[50]

Wiesel and X

J. Wiesel and X. Xu. Sparsity of quadratically regularized optimal transport: Bounds on con- centration and bias.SIAM J. Math. Anal., 57(6):6498–6521, 2025. A Proofs of Section 2 Proof of Proposition 2.3.Thisfollowsfromstandardargumentsasintheproofsof[30, Propo- sition 2.3, Theorem 3.2] (which are in turn partially based on [2]). Note that while [30] as- ...

2025
[51]

Sinceψis convex, it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≤ bf(X 1)− bf(X ′ 1) n + 1 n2 nX j=1 n bf(X ′ 1)− bf(X 1)−c(X ′ 1, Yj) +c(X 1, Yj) o ψ′ bf(X ′

+bg(Yj)−c(X ′ 1, Yj) ε −ψ bf(X 1) +bg(Yj)−c(X 1, Yj) ε o + bf(X 1)− bf(X ′ 1) n . Sinceψis convex, it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≤ bf(X 1)− bf(X ′ 1) n + 1 n2 nX j=1 n bf(X ′ 1)− bf(X 1)−c(X ′ 1, Yj) +c(X 1, Yj) o ψ′ bf(X ′
[52]

24 Choosing the pointwise extension ofbfgiven by Proposition 2.3(vi), the empirical first-order condition holds atX′

+bg(Yj)−c(X ′ 1, Yj) ε . 24 Choosing the pointwise extension ofbfgiven by Proposition 2.3(vi), the empirical first-order condition holds atX′
[53]

Using this optimality condition and the boundedness ofc(see Propo- sition 2.3), it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1)) ≤ 1 n2 nX j=1 ψ′ bf(X ′
[54]

bf(X i) +bg(Yj)−c(X i, Yj) ε q−2 + + 1 # . 43 Consequently, 1 n nX i=1 π(1) i,j ≤(q−1)C q

+bg(Yj)−c(X ′ 1, Yj) ε (c(X1, Yj)−c(X ′ 1, Yj))≤ 2∥c∥∞ n . By symmetry, this impliesbΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≥ − 2∥c∥∞ n . We conclude thatA≤ 4∥c∥2 ∞ n2 , and analogously,B≤ 4∥c∥2 ∞ n2 . The claim follows. Proof of Lemma 3.6.Fix(x 0, y0)∈Ω×Ω ′. By Proposition 2.3, the function x7→f ε(x) +g ε(y0)−c(x, y 0) is Lipschitz with constant2L. Hence, f...

[1] [1]

Balakrishnan, T

S. Balakrishnan, T. Manole, and L. Wasserman. Statistical inference for optimal transport maps: Recent advances and perspectives.arXiv:2506.19025, 2025

arXiv 2025

[2] [2]

Bayraktar, S

E. Bayraktar, S. Eckstein, and X. Zhang. Stability and sample complexity of divergence regu- larized optimal transport.Bernoulli, 31(1):213–239, 2025

2025

[3] [3]

Bigot, E

J. Bigot, E. Cazelles, and N. Papadakis. Central limit theorems for entropy-regularized op- timal transport on finite spaces and statistical applications.Electronic Journal of Statistics, 13(2):5120–5150, 2019

2019

[4] [4]

Blondel, V

M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. volume 84 of Proceedings of Machine Learning Research, pages 880–889, 2018

2018

[5] [5]

Boucheron, G

S. Boucheron, G. Lugosi, and P. Massart.Concentration Inequalities: A Nonasymptotic Theory of Independence. Oxford University Press, Feb. 2013

2013

[6] [6]

G. Carlier. On the linear convergence of the multimarginal Sinkhorn algorithm.SIAM J. Optim., 32(2):786–794, 2022

2022

[7] [7]

Chewi, J

S. Chewi, J. Niles-Weed, and P. Rigollet.Statistical Optimal Transport. Springer, 2025

2025

[8] [8]

Chizat, A

L. Chizat, A. Delalande, and T. Vaškevičius. Sharper exponential convergence rates for Sinkhorn’s algorithm in continuous settings.Math. Program., 215(1-2):809–858, 2026

2026

[9] [9]

M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In C. Burges, L. Bottou, M. Welling, Z. Ghahramani, and K. Weinberger, editors,Advances in Neural Infor- mation Processing Systems, volume 26. Curran Associates, Inc., 2013

2013

[10] [10]

del Barrio, A

E. del Barrio, A. González-Sanz, and J.-M. Loubes. Central limit theorems for general trans- portation costs.Ann. Inst. Henri Poincaré Probab. Stat., 60(2):847–873, 2024

2024

[11] [11]

del Barrio, A

E. del Barrio, A. González-Sanz, and J.-M. Loubes. Central limit theorems for semi-discrete Wasserstein distances.Bernoulli, 30(1):554–580, 2024

2024

[12] [12]

del Barrio, A

E. del Barrio, A. González-Sanz, J.-M. Loubes, and J. Niles-Weed. An improved central limit theorem and fast convergence rates for entropic transportation costs.SIAM Journal on Mathe- matics of Data Science, 5(3):639–669, 2023

2023

[13] [13]

del Barrio, A

E. del Barrio, A. González-Sanz, J.-M. Loubes, and D. Rodríguez-Vítores. Distributional limit theory for optimal transport.arXiv:2505.19104, 2025

arXiv 2025

[14] [14]

del Barrio and J.-M

E. del Barrio and J.-M. Loubes. Central limit theorems for empirical transportation cost in general dimension.Ann. Probab., 47(2):926–951, 2019

2019

[15] [15]

R. M. Dudley. The speed of mean Glivenko-Cantelli convergence.Ann. Math. Statist., 40:40–50, 1968

1968

[16] [16]

Eckstein and M

S. Eckstein and M. Nutz. Convergence rates for regularized optimal transport via quantization. Math. Oper. Res., 49(2):1223–1240, 2024. 20

2024

[17] [17]

Essid and J

M. Essid and J. Solomon. Quadratically regularized optimal transport on graphs.SIAM J. Sci. Comput., 40(4):A1961–A1986, 2018

2018

[18] [18]

Fournier and A

N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure.Probab. Theory Related Fields, 162(3-4):707–738, 2015

2015

[19] [19]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices.Linear Algebra Appl., 114/115:717–735, 1989

1989

[20] [20]

García Trillos, M

N. García Trillos, M. Gerlach, M. Hein, and D. Slepčev. Error estimates for spectral convergence of the graph Laplacian on random geometric graphs toward the Laplace-Beltrami operator. Found. Comput. Math., 20(4):827–887, 2020

2020

[21] [21]

García Trillos and D

N. García Trillos and D. Slepčev. On the rate of convergence of empirical measures in∞- transportation distance.Canad. J. Math., 67(6):1358–1383, 2015

2015

[22] [22]

Genevay, L

A. Genevay, L. Chizat, F. Bach, M. Cuturi, and G. Peyré. Sample complexity of Sinkhorn divergences. In K. Chaudhuri and M. Sugiyama, editors,Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 ofProceedings of Machine Learning Research, pages 1574–1583. PMLR, 16–18 Apr 2019

2019

[23] [23]

Goldfeld, K

Z. Goldfeld, K. Kato, G. Rioux, and R. Sadhu. Limit theorems for entropic optimal transport maps and Sinkhorn divergence.Electron. J. Statist., 18:980–1041, 2024

2024

[24] [24]

Goldfeld, K

Z. Goldfeld, K. Kato, G. Rioux, and R. Sadhu. Statistical inference with regularized optimal transport.Inf. Inference J. IMA, 13(1):iaad056, 2024

2024

[25] [25]

González-Sanz, E

A. González-Sanz, E. del Barrio, and M. Nutz. Sample complexity of quadratically regularized optimal transport.arXiv:2511.09807, 2025

arXiv 2025

[26] [26]

González-Sanz, R

A. González-Sanz, R. S. Gvalani, and L. Koch. Sharp local sparsity of regularized optimal transport.arXiv:2604.00843, 2026

Pith/arXiv arXiv 2026

[27] [27]

González-Sanz and M

A. González-Sanz and M. Nutz. Sparsity of quadratically regularized optimal transport: Scalar case, 2024. SIAM J. Math. Anal., forthcoming

2024

[28] [28]

González-Sanz, M

A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Linear convergence of gradient descent for quadratically regularized optimal transport.Preprint arXiv:2509.08547, 2025

Pith/arXiv arXiv 2025

[29] [29]

González-Sanz, M

A. González-Sanz, M. Nutz, and A. Riveros Valdevenito. Polyak–Łojasiewicz inequality for quadratically regularized optimal transport.SIAM J. Optim., 2026. forthcoming, arXiv:2605.27175

Pith/arXiv arXiv 2026

[30] [30]

González-Sanz, S

A. González-Sanz, S. Eckstein, and M. Nutz. Sparse regularized optimal transport without curse of dimensionality.Preprint arXiv:2505.04721, 2025

arXiv 2025

[31] [31]

González-Sanz and S

A. González-Sanz and S. Hundrieser. Weak limits for empirical entropic optimal transport: Beyond smooth costs.arXiv:2305.09745, 2023

arXiv 2023

[32] [32]

González-Sanz, J.-M

A. González-Sanz, J.-M. Loubes, and J. Niles-Weed. Weak limits of entropy regularized optimal transport; potentials, plans and divergences.arXiv:2207.07427, 2024

arXiv 2024

[33] [33]

Graf and H

S. Graf and H. Luschgy.Foundations of Quantization for Probability Distributions. Springer Berlin Heidelberg, 2000

2000

[34] [34]

Groppe and S

M. Groppe and S. Hundrieser. Lower complexity adaptation for empirical entropic optimal transport.J. Mach. Learn. Res., 25:Paper No. 344, 55, 2024

2024

[35] [35]

Hundrieser, T

S. Hundrieser, T. Staudt, and A. Munk. Empirical optimal transport between different measures adapts to lower complexity.Ann. Inst. Henri Poincaré Probab. Stat., 60(2):824–846, 2024

2024

[36] [36]

Li and X

P. Li and X. Chen. Sample complexity and weak limits of nonsmooth multimarginal Schrödinger system with application to optimal transport barycenter.arXiv:2502.02726, 2025. 21

arXiv 2025

[37] [37]

Liu, J.-G

A. Liu, J.-G. Liu, and Y. Lu. On the rate of convergence of empirical measure in∞-Wasserstein distance for unbounded density function.Quart. Appl. Math., 77(4):811–829, 2019

2019

[38] [38]

D. A. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport.Appl. Math. Optim., 83(3):1919–1949, 2021

1919

[39] [39]

Mena and J

G. Mena and J. Niles-Weed. Statistical bounds for entropic optimal transport: sample complex- ity and the central limit theorem. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett, editors,Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019

2019

[40] [40]

Muzellec, R

B. Muzellec, R. Nock, G. Patrini, and F. Nielsen. Tsallis regularized optimal transport and ecological inference.Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), Feb. 2017

2017

[41] [41]

Niles-Weed and P

J. Niles-Weed and P. Rigollet. Estimation of Wasserstein distances in the spiked transport model.Bernoulli, 28(4):2663–2688, 2022

2022

[42] [42]

M. Nutz. Quadratically regularized optimal transport: existence and multiplicity of potentials. SIAM J. Math. Anal., 57(3):2622–2649, 2025

2025

[43] [43]

Peyré and M

G. Peyré and M. Cuturi. Computational optimal transport: With applications to data science. Foundations and Trends in Machine Learning, 11(5-6):355–607, 2019

2019

[44] [44]

Rigollet and A

P. Rigollet and A. J. Stromme. On the sample complexity of entropic optimal transport.Ann. Statist., 53(1):61–90, 2025

2025

[45] [45]

Sadhu, Z

R. Sadhu, Z. Goldfeld, and K. Kato. Stability and statistical inference for semidiscrete optimal transport maps.Ann. Appl. Probab., 34(6):5694–5736, 2024

2024

[46] [46]

Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Non- linear Differential Equations and their Applications

F. Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Non- linear Differential Equations and their Applications. Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling

2015

[47] [47]

A. J. Stromme. Minimum intrinsic dimension scaling for entropic optimal transport. arXiv:2306.03398, 2023

arXiv 2023

[48] [48]

A. W. van der Vaart and J. A. Wellner.Weak Convergence and Empirical Processes. Springer, 1996

1996

[49] [49]

Villani.Optimal Transport: Old and New

C. Villani.Optimal Transport: Old and New. Springer-Verlag, Berlin, 2009

2009

[50] [50]

Wiesel and X

J. Wiesel and X. Xu. Sparsity of quadratically regularized optimal transport: Bounds on con- centration and bias.SIAM J. Math. Anal., 57(6):6498–6521, 2025. A Proofs of Section 2 Proof of Proposition 2.3.Thisfollowsfromstandardargumentsasintheproofsof[30, Propo- sition 2.3, Theorem 3.2] (which are in turn partially based on [2]). Note that while [30] as- ...

2025

[51] [51]

Sinceψis convex, it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≤ bf(X 1)− bf(X ′ 1) n + 1 n2 nX j=1 n bf(X ′ 1)− bf(X 1)−c(X ′ 1, Yj) +c(X 1, Yj) o ψ′ bf(X ′

+bg(Yj)−c(X ′ 1, Yj) ε −ψ bf(X 1) +bg(Yj)−c(X 1, Yj) ε o + bf(X 1)− bf(X ′ 1) n . Sinceψis convex, it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≤ bf(X 1)− bf(X ′ 1) n + 1 n2 nX j=1 n bf(X ′ 1)− bf(X 1)−c(X ′ 1, Yj) +c(X 1, Yj) o ψ′ bf(X ′

[52] [52]

24 Choosing the pointwise extension ofbfgiven by Proposition 2.3(vi), the empirical first-order condition holds atX′

+bg(Yj)−c(X ′ 1, Yj) ε . 24 Choosing the pointwise extension ofbfgiven by Proposition 2.3(vi), the empirical first-order condition holds atX′

[53] [53]

Using this optimality condition and the boundedness ofc(see Propo- sition 2.3), it follows that bΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1)) ≤ 1 n2 nX j=1 ψ′ bf(X ′

[54] [54]

bf(X i) +bg(Yj)−c(X i, Yj) ε q−2 + + 1 # . 43 Consequently, 1 n nX i=1 π(1) i,j ≤(q−1)C q

+bg(Yj)−c(X ′ 1, Yj) ε (c(X1, Yj)−c(X ′ 1, Yj))≤ 2∥c∥∞ n . By symmetry, this impliesbΦϕ,ε(bf ,bg)−bΦ(1) ϕ,ε(bf (1),bg(1))≥ − 2∥c∥∞ n . We conclude thatA≤ 4∥c∥2 ∞ n2 , and analogously,B≤ 4∥c∥2 ∞ n2 . The claim follows. Proof of Lemma 3.6.Fix(x 0, y0)∈Ω×Ω ′. By Proposition 2.3, the function x7→f ε(x) +g ε(y0)−c(x, y 0) is Lipschitz with constant2L. Hence, f...