Provably convergent stochastic fixed-point algorithm for free-support Wasserstein barycenter of continuous non-parametric measures

Ariel Neufeld; Qikun Xiang; Zeyi Chen

arxiv: 2505.24384 · v2 · submitted 2025-05-30 · 🧮 math.OC · cs.NA· math.NA· math.PR

Provably convergent stochastic fixed-point algorithm for free-support Wasserstein barycenter of continuous non-parametric measures

Zeyi Chen , Ariel Neufeld , Qikun Xiang This is my paper

Pith reviewed 2026-05-19 13:24 UTC · model grok-4.3

classification 🧮 math.OC cs.NAmath.NAmath.PR

keywords Wasserstein barycenterstochastic fixed-pointoptimal transport mapconvergence analysisnon-parametric measuresentropic regularizationCaffarelli regularity

0 comments

The pith

A stochastic fixed-point iteration converges almost surely to the 2-Wasserstein barycenter of continuous non-parametric measures when optimal transport map errors remain controlled.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds an estimator-driven version of an existing fixed-point scheme for finding the barycenter that averages several continuous probability measures under the 2-Wasserstein distance. It supplies the first almost-sure convergence proof for any practical, implementable stochastic version of that scheme. The proof rests on replacing exact optimal transport maps with a modified entropic estimator whose error can be bounded. The resulting algorithm stays tractable for measures that obey Caffarelli-type regularity, a dense subclass of the Wasserstein space. The authors also supply a synthetic benchmark generator that produces input measures whose approximate barycenters are known, allowing direct numerical checks of accuracy.

Core claim

We develop an estimator-based stochastic fixed-point framework for approximately computing the 2-Wasserstein barycenter of continuous, non-parametric probability measures. We establish almost sure convergence of the iterates and identify sufficient conditions for geometric rates of convergence under controlled errors in optimal transport map estimation. The framework is realized by a concrete algorithm that employs a modified entropic OT map estimator and applies to input measures satisfying Caffarelli-type regularity conditions.

What carries the argument

Stochastic fixed-point iteration that replaces exact optimal transport maps with a modified entropic estimator whose approximation error is controlled at each step.

If this is right

The algorithm produces a sequence of measures that converges almost surely to the true barycenter.
Geometric convergence holds once the per-step OT map error is kept below a paper-derived tolerance.
The method extends to any collection of continuous measures inside the dense Caffarelli-regular subset of the Wasserstein space.
Benchmark instances with known approximate barycenters can be generated to test estimation accuracy and sampling quality.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same controlled-error argument may apply to other fixed-point schemes that rely on transport maps, such as those arising in distributionally robust optimization.
Because the regularity set is dense, the algorithm can serve as a practical default for many real-world continuous measures even when exact regularity is not verified.
The synthetic benchmark procedure offers a template for constructing test suites that compare multiple barycenter methods on identical, non-trivial instances.

Load-bearing premise

The input measures obey Caffarelli-type regularity conditions so that the entropic optimal transport map estimator has controlled error.

What would settle it

Numerical runs on measures that violate Caffarelli regularity in which the stochastic iterates diverge or fail to approach the known barycenter once the allowed estimation error exceeds the derived threshold.

read the original abstract

We develop an estimator-based stochastic fixed-point framework for approximately computing the 2-Wasserstein barycenter of continuous, non-parametric probability measures. Notably, we provide the first rigorous convergence analysis for implementable estimator-based stochastic extensions of the fixed-point iterative scheme proposed by \'Alvarez-Esteban, del Barrio, Cuesta-Albertos, and Matr\'an (2016). In particular, we establish almost sure convergence, and identify sufficient conditions for geometric rates of convergence under controlled errors in optimal transport (OT) map estimation. We subsequently propose a concrete, provably convergent, and computationally tractable stochastic algorithm that accommodates input measures satisfying Caffarelli-type regularity conditions, which form a dense subset of the Wasserstein space. This algorithm leverages a modified entropic OT map estimator to enable efficient and scalable implementation. To facilitate quantitative evaluation, we further propose a novel and efficient procedure for synthetically generating benchmark instances, in which the input measures exhibit non-trivial features and the corresponding barycenters are approximately known. Numerical experiments on both synthetic and real-world datasets demonstrate the strong computational efficiency, estimation accuracy, and sampling flexibility of our approach.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper develops an estimator-based stochastic fixed-point framework for computing the 2-Wasserstein barycenter of continuous non-parametric probability measures. It provides the first rigorous almost-sure convergence analysis for implementable stochastic extensions of the 2016 fixed-point scheme of Alvarez-Esteban et al., under controlled errors in optimal transport map estimation, and identifies sufficient conditions for geometric convergence rates. A concrete algorithm is proposed that uses a modified entropic OT map estimator for input measures satisfying Caffarelli-type regularity (a dense subset of the Wasserstein space), together with a synthetic benchmark generation procedure and numerical experiments on synthetic and real data.

Significance. If the central convergence result holds, the work supplies the first provably convergent, computationally tractable stochastic algorithm for free-support Wasserstein barycenters of continuous measures, extending the 2016 deterministic fixed-point iteration to the estimator-based setting while preserving almost-sure convergence. The explicit sufficient conditions for geometric rates and the benchmark-generation procedure are useful additions to the optimal transport and stochastic approximation literature.

major comments (1)

[Convergence analysis section / main convergence theorem] The almost-sure convergence of the stochastic fixed-point iteration (presumably Theorem 3.1 or the main result in the convergence analysis section) requires that the cumulative OT-map estimation error sequence satisfy a summability condition (e.g., ∑ ε_n < ∞ a.s.) for the invoked stochastic approximation theorem to yield a.s. convergence rather than convergence in probability. The paper invokes Caffarelli regularity to guarantee that the modified entropic OT map estimator has controlled error, yet no explicit quantitative bound is supplied showing that the error (as a function of regularization parameter, sample size, and dimension) decays sufficiently rapidly to meet this summability requirement uniformly over the regularity class. If the error decays only like 1/√n or slower, the a.s. claim may not hold.

minor comments (2)

[Numerical experiments / benchmark section] The synthetic benchmark generation procedure is described as producing instances with 'approximately known' barycenters; a short paragraph clarifying the validation method (e.g., how the reference barycenter is computed or approximated) would improve reproducibility.
[Algorithm description] Notation for the modified entropic estimator (regularization parameter, sample sizes) should be introduced consistently in the algorithm description and reused in the error-bound statements.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful reading of our manuscript and the constructive comment on the convergence analysis. We address the point below and indicate the revisions we will make.

read point-by-point responses

Referee: The almost-sure convergence of the stochastic fixed-point iteration (presumably Theorem 3.1 or the main result in the convergence analysis section) requires that the cumulative OT-map estimation error sequence satisfy a summability condition (e.g., ∑ ε_n < ∞ a.s.) for the invoked stochastic approximation theorem to yield a.s. convergence rather than convergence in probability. The paper invokes Caffarelli regularity to guarantee that the modified entropic OT map estimator has controlled error, yet no explicit quantitative bound is supplied showing that the error (as a function of regularization parameter, sample size, and dimension) decays sufficiently rapidly to meet this summability requirement uniformly over the regularity class. If the error decays only like 1/√n or slower, the a.s. claim may not hold.

Authors: We thank the referee for this precise observation. Theorem 3.1 applies a stochastic approximation result whose almost-sure convergence indeed requires the summability condition ∑ ε_n < ∞ a.s. on the cumulative OT-map estimation errors. The manuscript states that Caffarelli regularity ensures the modified entropic estimator produces controlled errors and that the algorithm selects sequences of regularization parameters and sample sizes to meet the summability requirement. We acknowledge, however, that the current version does not supply explicit quantitative bounds on the estimator error (in terms of regularization, sample size, and dimension) that verify summability uniformly over the regularity class. In the revision we will add a dedicated subsection (and supporting appendix) that derives the convergence rate of the modified entropic OT map estimator under Caffarelli conditions. We will then show that polynomial growth of the per-iteration sample size (e.g., n_k = k^3) combined with a suitable decay of the regularization parameter yields an error sequence satisfying ∑ ε_n < ∞ a.s., thereby confirming the almost-sure convergence claim for the concrete algorithm. revision: yes

Circularity Check

0 steps flagged

Convergence analysis extends standard stochastic approximation with external 2016 fixed-point reference

full rationale

The derivation chain invokes standard stochastic approximation and fixed-point theory for a.s. convergence under controlled OT-map errors. The 2016 Alvarez-Esteban et al. scheme is treated as external prior work by non-overlapping authors. Caffarelli regularity is invoked only to justify applicability of a modified entropic estimator with bounded error; this is an enabling assumption rather than a self-referential reduction. No equations or steps reduce the claimed convergence to a fitted parameter or self-citation chain. The central result retains independent mathematical content.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on standard results from optimal transport theory and stochastic approximation; no new free parameters or invented entities are introduced beyond the choice of entropic regularization and step-size sequences that are standard in the field.

axioms (2)

domain assumption Caffarelli-type regularity conditions on the input measures
Invoked to guarantee that the modified entropic OT map estimator has controlled error and that the fixed-point iteration is well-defined.
standard math Existence and uniqueness of the 2-Wasserstein barycenter for the given measures
Standard background result from optimal transport theory used to define the target of the iteration.

pith-pipeline@v0.9.0 · 5746 in / 1338 out tokens · 44025 ms · 2026-05-19T13:24:55.512279+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We establish almost sure convergence... under controlled errors in optimal transport (OT) map estimation... Caffarelli-type regularity conditions
IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

modified entropic OT map estimator... Sinkhorn’s algorithm

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

85 extracted references · 85 canonical work pages

[1]

Agueh and G

M. Agueh and G. Carlier. Barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011

work page 2011
[2]

J. M. Altschuler and E. Boix-Adser `a. Wasserstein barycenters are NP-hard to compute.SIAM Journal on Mathematics of Data Science, 4(1):179–203, 2022

work page 2022
[3]

P. C. ´Alvarez-Esteban, E. del Barrio, J. Cuesta-Albertos, and C. Matr ´an. A fixed-point approach to barycenters in Wasserstein space.Journal of Mathematical Analysis and Applications, 441(2):744–762, 2016

work page 2016
[4]

Ambrosio, N

L. Ambrosio, N. Gigli, and G. Savar ´e.Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2008

work page 2008
[5]

Backhoff, J

J. Backhoff, J. Fontbona, G. Rios, and F. Tobar. Stochastic gradient descent for barycenters in Wasserstein space.J. Appl. Probab., 62(1):15–43, 2025

work page 2025
[6]

Bigot, E

J. Bigot, E. Cazelles, and N. Papadakis. Penalization of barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 51(3):2261–2285, 2019

work page 2019
[7]

V . I. Bogachev.Measure theory. Vol. I, II. Springer-Verlag, Berlin, 2007

work page 2007
[8]

L. A. Caffarelli. A localization property of viscosity solutions to the Monge-Amp `ere equation and their strict convexity.Annals of Mathematics, 131(1):129–134, 1990

work page 1990
[9]

L. A. Caffarelli. Some regularity properties of solutions of Monge-Amp`ere equation.Communications on Pure and Applied Mathematics, 44(8-9):965–969, 1991

work page 1991
[10]

L. A. Caffarelli. The regularity of mappings with a convex potential.Journal of the American Mathemat- ical Society, 5(1):99–104, 1992

work page 1992
[11]

L. A. Caffarelli. Boundary regularity of maps with convex potentials–II.Annals of Mathematics, 144(3): 453–496, 1996

work page 1996
[12]

Campbell and T

T. Campbell and T. Broderick. Bayesian coreset construction via greedy iterative geodesic ascent. In International Conference on Machine Learning, pages 698–706. PMLR, 2018

work page 2018
[13]

Carlier, K

G. Carlier, K. Eichinger, and A. Kroshnin. Entropic-Wasserstein barycenters: PDE characterization, reg- ularity, and CLT.SIAM Journal on Mathematical Analysis, 53(5):5880–5914, 2021

work page 2021
[14]

Carlier, A

G. Carlier, A. Delalande, and Q. M ´erigot. Quantitative stability of barycenters in the Wasserstein space. Probab. Theory Related Fields, 188(3-4):1257–1286, 2024

work page 2024
[15]

Carpenter, A

B. Carpenter, A. Gelman, M. D. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. Brubaker, J. Guo, P. Li, and A. Riddell. Stan: A probabilistic programming language.Journal of statistical software, 76: 1–32, 2017

work page 2017
[16]

Chewi, T

S. Chewi, T. Maunu, P. Rigollet, and A. Stromme. Gradient descent algorithms for Bures-Wasserstein barycenters. InProceedings of 33rd Conference on Learning Theory, volume 125, pages 1276–1304. PMLR, 2020

work page 2020
[17]

arXiv preprint arXiv:2407.18163 , volume=

S. Chewi, J. Niles-Weed, and P. Rigollet. Statistical optimal transport.Preprint, arXiv:2407.18163, 2024

work page arXiv 2024
[18]

L. Chizat. Doubly regularized entropic Wasserstein barycenter.Foundations of Computational Mathe- matics, 2025

work page 2025
[19]

Chizat, P

L. Chizat, P. Roussillon, F. L ´eger, F.-X. Vialard, and G. Peyr ´e. Faster Wasserstein distance estimation with the Sinkhorn divergence. InAdvances in Neural Information Processing Systems, volume 33, pages 2257–2269, 2020

work page 2020
[20]

Chizat, A

L. Chizat, A. Delalande, and T. Va ˇskeviˇcius. Sharper exponential convergence rates for Sinkhorn’s algo- rithm in continuous settings.Math. Program., 215(1-2):809–858, 2026

work page 2026
[21]

Claici, E

S. Claici, E. Chien, and J. Solomon. Stochastic Wasserstein barycenters. InProceedings of the 35th International Conference on Machine Learning, volume 80, pages 999–1008. PMLR, 2018

work page 2018
[22]

Cohen, M

S. Cohen, M. Arbel, and M. P. Deisenroth. Estimating barycenters of measures in high dimensions. Preprint, arXiv:2007.07105, 2020

work page arXiv 2007
[23]

A. D. D. Craik. Prehistory of Fa `a di Bruno’s formula.The American Mathematical Monthly, 112(2): 119–130, 2005

work page 2005
[24]

J. A. Cuesta-Albertos, L. R ¨uschendorf, and A. Tuero-D´ıaz. Optimal coupling of multivariate distributions and stochastic processes.J. Multivariate Anal., 46(2):335–361, 1993

work page 1993
[25]

Curmei and G

M. Curmei and G. Hall. Shape-constrained regression using sum of squares polynomials.Operations Research, 73(1):543–559, 2023. PROV ABLY CONVERGENT ALGORITHM FOR FREE-SUPPORT W ASSERSTEIN BARYCENTER 65

work page 2023
[26]

M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. InAdvances in Neural Information Processing Systems, volume 26, 2013

work page 2013
[27]

Cuturi and A

M. Cuturi and A. Doucet. Fast computation of Wasserstein barycenters. InInternational conference on machine learning, pages 685–693. PMLR, 2014

work page 2014
[28]

Cuturi and G

M. Cuturi and G. Peyr ´e. Semidual regularized optimal transport.SIAM Review, 60(4):941–965, 2018

work page 2018
[29]

Cuturi, L

M. Cuturi, L. Meng-Papaxanthos, Y . Tian, C. Bunne, G. Davis, and O. Teboul. Optimal transport tools (OTT): A JAX toolbox for all things Wasserstein.Preprint, arXiv:2201.12324, 2022

work page arXiv 2022
[30]

N. Deb, P. Ghosal, and B. Sen. Rates of estimation of optimal transport maps using plug-in estimators via barycentric projections. InAdvances in Neural Information Processing Systems, volume 34, pages 29736–29753, 2021

work page 2021
[31]

A. L. Dontchev and R. T. Rockafellar.Implicit functions and solution mappings: A view from variational analysis. Springer Monographs in Mathematics. Springer, Dordrecht, 2009

work page 2009
[32]

Dvurechensky, A

P. Dvurechensky, A. Gasnikov, and A. Kroshnin. Computational optimal transport: Complexity by accel- erated gradient descent is better than by Sinkhorn’s algorithm. In J. Dy and A. Krause, editors,Proceed- ings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 1367–1376. PMLR, 10–15 Jul 2018

work page 2018
[33]

L. C. Evans.Partial Differential Equations, volume 19 ofGraduate Studies in Mathematics. American Mathematical Society, 2nd edition, 2010

work page 2010
[34]

J. Fan, A. Taghvaei, and Y . Chen. Scalable computations of Wasserstein barycenter via input convex neural networks. InProceedings of the 38th International Conference on Machine Learning, volume 139, pages 1571–1581. PMLR, 2021

work page 2021
[35]

Feydy.Geometric Data Analysis, Beyond Convolutions

J. Feydy.Geometric Data Analysis, Beyond Convolutions. Phd thesis, Universit ´e Paris-Saclay, France, 2020

work page 2020
[36]

Feydy, P

J. Feydy, P. Roussillon, A. Trouv´e, and P. Gori. Fast and scalable optimal transport for brain tractograms. InMedical Image Computing and Computer Assisted Intervention – MICCAI 2019, volume 11767 of Lecture Notes in Computer Science, pages 636–644. Springer, 2019

work page 2019
[37]

Feydy, T

J. Feydy, T. S ´ejourn´e, F.-X. Vialard, S.-i. Amari, A. Trouv ´e, and G. Peyr ´e. Interpolating between opti- mal transport and MMD using Sinkhorn divergences. InThe 22nd international conference on artificial intelligence and statistics, pages 2681–2690. PMLR, 2019

work page 2019
[38]

Flamary, K

R. Flamary, K. Lounici, and A. Ferrari. Concentration bounds for linear Monge mapping estimation and optimal transport domain adaptation.Preprint, arXiv:1905.10155, 2019

work page arXiv 1905
[39]

Flamary, N

R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, et al. POT: Python optimal transport.Journal of Machine Learning Research, 22 (78):1–8, 2021

work page 2021
[40]

Fournier and A

N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability theory and related fields, 162(3-4):707–738, 2015

work page 2015
[41]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices.Linear Algebra Appl., 114/115: 717–735, 1989

work page 1989
[42]

Genevay, M

A. Genevay, M. Cuturi, G. Peyr ´e, and F. Bach. Stochastic optimization for large-scale optimal transport. InAdvances in Neural Information Processing Systems, volume 29, 2016

work page 2016
[43]

Ghosal and M

P. Ghosal and M. Nutz. On the convergence rate of Sinkhorn’s algorithm.Mathematics of Operations Research, 2025

work page 2025
[44]

Ghosal and B

P. Ghosal and B. Sen. Multivariate ranks and quantiles using optimal transport: consistency, rates and nonparametric testing.Ann. Statist., 50(2):1012–1037, 2022

work page 2022
[45]

N. Gigli. On H ¨older continuity-in-time of the optimal transport map towards measures along a curve. Proceedings of the Edinburgh Mathematical Society, 54(2):401–409, 2011

work page 2011
[46]

Gonz ´alez-Sanz, L

A. Gonz ´alez-Sanz, L. De Lara, L. B ´ethune, and J.-M. Loubes. GAN estimation of Lipschitz optimal transport maps.Preprint, arXiv:2202.07965, 2022

work page arXiv 2022
[47]

H ¨utter and P

J.-C. H ¨utter and P. Rigollet. Minimax estimation of smooth optimal transport maps.The Annals of Statistics, 49(2):1166–1194, 2021

work page 2021
[48]

Janati, M

H. Janati, M. Cuturi, and A. Gramfort. Debiased Sinkhorn barycenters. InInternational Conference on Machine Learning, pages 4692–4701. PMLR, 2020

work page 2020
[49]

L. V . Kantorovich. On a problem of Monge.CR (Doklady) Acad. Sci. URSS (NS), 3:225–226, 1948. 66 Z. CHEN, A. NEUFELD, AND Q. XIANG

work page 1948
[50]

H. Karcher. Riemannian center of mass and mollifier smoothing.Communications on Pure and Applied Mathematics, 30(5):509–541, 1977

work page 1977
[51]

K. Kim, R. Yao, C. Zhu, and X. Chen. Optimal transport barycenter via nonconvex-concave minimax optimization. InProceedings of the 42nd International Conference on Machine Learning, volume 267 of Proceedings of Machine Learning Research, pages 30879–30899. PMLR, 2025

work page 2025
[52]

Korotin, V

A. Korotin, V . Egiazarian, L. Li, and E. Burnaev. Wasserstein iterative networks for barycenter estimation. InAdvances in Neural Information Processing Systems, volume 35, pages 15672–15686, 2022

work page 2022
[53]

T. T.-K. Lau and H. Liu. Wasserstein distributionally robust optimization with wasserstein barycenters. Preprint, arXiv:2203.12136, 2022

work page arXiv 2022
[54]

Le Gouic and J.-M

T. Le Gouic and J.-M. Loubes. Existence and consistency of Wasserstein barycenters.Probab. Theory Related Fields, 168(3-4):901–917, 2017

work page 2017
[55]

L. Li, A. Genevay, M. Yurochkin, and J. M. Solomon. Continuous regularized Wasserstein barycenters. InAdvances in Neural Information Processing Systems, volume 33, pages 17755–17765, 2020

work page 2020
[56]

Makkuva, A

A. Makkuva, A. Taghvaei, S. Oh, and J. Lee. Optimal transport mapping via input convex neural networks. InInternational Conference on Machine Learning, pages 6672–6681. PMLR, 2020

work page 2020
[57]

Manole, S

T. Manole, S. Balakrishnan, J. Niles-Weed, and L. Wasserman. Plugin estimation of smooth optimal transport maps.The Annals of Statistics, 52(3):966–998, 2024

work page 2024
[58]

A. J. McNeil, R. Frey, and P. Embrechts.Quantitative risk management: concepts, techniques and tools- revised edition. Princeton University Press, 2015

work page 2015
[59]

Mendelson and A

S. Mendelson and A. Pajor. On singular values of matrices with independent rows.Bernoulli, 12(5): 761–773, 2006

work page 2006
[60]

Minsker, S

S. Minsker, S. Srivastava, L. Lin, and D. Dunson. Scalable and robust bayesian inference via the median posterior. InInternational conference on machine learning, pages 1656–1664. PMLR, 2014

work page 2014
[61]

E. F. Montesuma, Y . Bendou, and M. Gartrell. Computing Wasserstein barycenters through gradient flows. Preprint, arXiv:2510.04602, 2025

work page arXiv 2025
[62]

Muzellec, A

B. Muzellec, A. Vacher, F. Bach, F.-X. Vialard, and A. Rudi. Near-optimal estimation of smooth transport maps with kernel sums-of-squares.Preprint, arXiv:2112.01907, 2021

work page arXiv 2021
[63]

Nesterov.Introductory lectures on convex optimization: A basic course, volume 87

Y . Nesterov.Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2004

work page 2004
[64]

V . M. Panaretos and Y . Zemel.An invitation to statistics in Wasserstein space. Springer, 2020

work page 2020
[65]

F.-P. Paty, A. d’Aspremont, and M. Cuturi. Regularity as regularization: Smooth and strongly convex Brenier potentials in optimal transport. InProceedings of the 23rd International Conference on Artificial Intelligence and Statistics, volume 108, pages 1222–1232. PMLR, 2020

work page 2020
[66]

E. V . Petracou, A. Xepapadeas, and A. N. Yannacopoulos. Decision making under model uncertainty: Fr´echet–Wasserstein mean preferences.Management Science, 68(2):1195–1211, 2022

work page 2022
[67]

Peyr ´e and M

G. Peyr ´e and M. Cuturi. Computational optimal transport: With applications to data science.Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019

work page 2019
[68]

B. T. Polyak. Gradient methods for the minimisation of functionals.USSR Computational Mathematics and Mathematical Physics, 3(4):864–878, 1963

work page 1963
[69]

Pooladian and J

A.-A. Pooladian and J. Niles-Weed. Entropic estimation of optimal transport maps.Preprint, arXiv:2109.12004, 2024

work page arXiv 2024
[70]

Rabin, G

J. Rabin, G. Peyr ´e, J. Delon, and M. Bernot. Wasserstein barycenter and its application to texture mixing. InScale Space and Variational Methods in Computer Vision: Third International Conference, SSVM 2011, Ein-Gedi, Israel, May 29–June 2, 2011, Revised Selected Papers 3, pages 435–446. Springer, 2012

work page 2011
[71]

R. T. Rockafellar.Convex Analysis:(PMS-28). Princeton university press, 1970

work page 1970
[72]

R. T. Rockafellar and R. J.-B. Wets.Variational analysis, volume 317 ofGrundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998

work page 1998
[73]

Rychener, A

Y . Rychener, A. Esteban-P´erez, J. M. Morales, and D. Kuhn. Wasserstein distributionally robust optimiza- tion with heterogeneous data sources.Preprint, arXiv:2407.13582, 2024

work page arXiv 2024
[74]

Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Nonlinear Differential Equations and their Applications

F. Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Nonlinear Differential Equations and their Applications. Birkh¨auser/Springer, Cham, 2015

work page 2015
[75]

Sinkhorn

R. Sinkhorn. A relationship between arbitrary positive matrices and doubly stochastic matrices.Ann. Math. Statist., 35:876–879, 1964. PROV ABLY CONVERGENT ALGORITHM FOR FREE-SUPPORT W ASSERSTEIN BARYCENTER 67

work page 1964
[76]

Srivastava, V

S. Srivastava, V . Cevher, Q. Dinh, and D. Dunson. W ASP: Scalable Bayes via barycenters of subset poste- riors. InProceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, volume 38, pages 912–920. PMLR, 2015

work page 2015
[77]

Srivastava, C

S. Srivastava, C. Li, and D. B. Dunson. Scalable Bayes via barycenter in Wasserstein space.Journal of Machine Learning Research, 19(8):1–35, 2018

work page 2018
[78]

Tas ¸kesen, S

B. Tas ¸kesen, S. Shafieezadeh-Abadeh, and D. Kuhn. Semi-discrete optimal transport: hardness, regular- ization and numerical solution.Mathematical Programming, 199:1033–1106, 2023

work page 2023
[79]

Tanguy, J

E. Tanguy, J. Delon, and N. Gozlan. Computing barycentres of measures for generic transport costs. Preprint, arXiv:2501.04016, 2024

work page arXiv 2024
[80]

Tas ¸kesen, S

B. Tas ¸kesen, S. Shafieezadeh-Abadeh, D. Kuhn, and K. Natarajan. Discrete optimal transport with inde- pendent marginals is #P-hard.SIAM Journal on Optimization, 33(2):589–614, 2023

work page 2023

Showing first 80 references.

[1] [1]

Agueh and G

M. Agueh and G. Carlier. Barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011

work page 2011

[2] [2]

J. M. Altschuler and E. Boix-Adser `a. Wasserstein barycenters are NP-hard to compute.SIAM Journal on Mathematics of Data Science, 4(1):179–203, 2022

work page 2022

[3] [3]

P. C. ´Alvarez-Esteban, E. del Barrio, J. Cuesta-Albertos, and C. Matr ´an. A fixed-point approach to barycenters in Wasserstein space.Journal of Mathematical Analysis and Applications, 441(2):744–762, 2016

work page 2016

[4] [4]

Ambrosio, N

L. Ambrosio, N. Gigli, and G. Savar ´e.Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2008

work page 2008

[5] [5]

Backhoff, J

J. Backhoff, J. Fontbona, G. Rios, and F. Tobar. Stochastic gradient descent for barycenters in Wasserstein space.J. Appl. Probab., 62(1):15–43, 2025

work page 2025

[6] [6]

Bigot, E

J. Bigot, E. Cazelles, and N. Papadakis. Penalization of barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 51(3):2261–2285, 2019

work page 2019

[7] [7]

V . I. Bogachev.Measure theory. Vol. I, II. Springer-Verlag, Berlin, 2007

work page 2007

[8] [8]

L. A. Caffarelli. A localization property of viscosity solutions to the Monge-Amp `ere equation and their strict convexity.Annals of Mathematics, 131(1):129–134, 1990

work page 1990

[9] [9]

L. A. Caffarelli. Some regularity properties of solutions of Monge-Amp`ere equation.Communications on Pure and Applied Mathematics, 44(8-9):965–969, 1991

work page 1991

[10] [10]

L. A. Caffarelli. The regularity of mappings with a convex potential.Journal of the American Mathemat- ical Society, 5(1):99–104, 1992

work page 1992

[11] [11]

L. A. Caffarelli. Boundary regularity of maps with convex potentials–II.Annals of Mathematics, 144(3): 453–496, 1996

work page 1996

[12] [12]

Campbell and T

T. Campbell and T. Broderick. Bayesian coreset construction via greedy iterative geodesic ascent. In International Conference on Machine Learning, pages 698–706. PMLR, 2018

work page 2018

[13] [13]

Carlier, K

G. Carlier, K. Eichinger, and A. Kroshnin. Entropic-Wasserstein barycenters: PDE characterization, reg- ularity, and CLT.SIAM Journal on Mathematical Analysis, 53(5):5880–5914, 2021

work page 2021

[14] [14]

Carlier, A

G. Carlier, A. Delalande, and Q. M ´erigot. Quantitative stability of barycenters in the Wasserstein space. Probab. Theory Related Fields, 188(3-4):1257–1286, 2024

work page 2024

[15] [15]

Carpenter, A

B. Carpenter, A. Gelman, M. D. Hoffman, D. Lee, B. Goodrich, M. Betancourt, M. Brubaker, J. Guo, P. Li, and A. Riddell. Stan: A probabilistic programming language.Journal of statistical software, 76: 1–32, 2017

work page 2017

[16] [16]

Chewi, T

S. Chewi, T. Maunu, P. Rigollet, and A. Stromme. Gradient descent algorithms for Bures-Wasserstein barycenters. InProceedings of 33rd Conference on Learning Theory, volume 125, pages 1276–1304. PMLR, 2020

work page 2020

[17] [17]

arXiv preprint arXiv:2407.18163 , volume=

S. Chewi, J. Niles-Weed, and P. Rigollet. Statistical optimal transport.Preprint, arXiv:2407.18163, 2024

work page arXiv 2024

[18] [18]

L. Chizat. Doubly regularized entropic Wasserstein barycenter.Foundations of Computational Mathe- matics, 2025

work page 2025

[19] [19]

Chizat, P

L. Chizat, P. Roussillon, F. L ´eger, F.-X. Vialard, and G. Peyr ´e. Faster Wasserstein distance estimation with the Sinkhorn divergence. InAdvances in Neural Information Processing Systems, volume 33, pages 2257–2269, 2020

work page 2020

[20] [20]

Chizat, A

L. Chizat, A. Delalande, and T. Va ˇskeviˇcius. Sharper exponential convergence rates for Sinkhorn’s algo- rithm in continuous settings.Math. Program., 215(1-2):809–858, 2026

work page 2026

[21] [21]

Claici, E

S. Claici, E. Chien, and J. Solomon. Stochastic Wasserstein barycenters. InProceedings of the 35th International Conference on Machine Learning, volume 80, pages 999–1008. PMLR, 2018

work page 2018

[22] [22]

Cohen, M

S. Cohen, M. Arbel, and M. P. Deisenroth. Estimating barycenters of measures in high dimensions. Preprint, arXiv:2007.07105, 2020

work page arXiv 2007

[23] [23]

A. D. D. Craik. Prehistory of Fa `a di Bruno’s formula.The American Mathematical Monthly, 112(2): 119–130, 2005

work page 2005

[24] [24]

J. A. Cuesta-Albertos, L. R ¨uschendorf, and A. Tuero-D´ıaz. Optimal coupling of multivariate distributions and stochastic processes.J. Multivariate Anal., 46(2):335–361, 1993

work page 1993

[25] [25]

Curmei and G

M. Curmei and G. Hall. Shape-constrained regression using sum of squares polynomials.Operations Research, 73(1):543–559, 2023. PROV ABLY CONVERGENT ALGORITHM FOR FREE-SUPPORT W ASSERSTEIN BARYCENTER 65

work page 2023

[26] [26]

M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. InAdvances in Neural Information Processing Systems, volume 26, 2013

work page 2013

[27] [27]

Cuturi and A

M. Cuturi and A. Doucet. Fast computation of Wasserstein barycenters. InInternational conference on machine learning, pages 685–693. PMLR, 2014

work page 2014

[28] [28]

Cuturi and G

M. Cuturi and G. Peyr ´e. Semidual regularized optimal transport.SIAM Review, 60(4):941–965, 2018

work page 2018

[29] [29]

Cuturi, L

M. Cuturi, L. Meng-Papaxanthos, Y . Tian, C. Bunne, G. Davis, and O. Teboul. Optimal transport tools (OTT): A JAX toolbox for all things Wasserstein.Preprint, arXiv:2201.12324, 2022

work page arXiv 2022

[30] [30]

N. Deb, P. Ghosal, and B. Sen. Rates of estimation of optimal transport maps using plug-in estimators via barycentric projections. InAdvances in Neural Information Processing Systems, volume 34, pages 29736–29753, 2021

work page 2021

[31] [31]

A. L. Dontchev and R. T. Rockafellar.Implicit functions and solution mappings: A view from variational analysis. Springer Monographs in Mathematics. Springer, Dordrecht, 2009

work page 2009

[32] [32]

Dvurechensky, A

P. Dvurechensky, A. Gasnikov, and A. Kroshnin. Computational optimal transport: Complexity by accel- erated gradient descent is better than by Sinkhorn’s algorithm. In J. Dy and A. Krause, editors,Proceed- ings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 1367–1376. PMLR, 10–15 Jul 2018

work page 2018

[33] [33]

L. C. Evans.Partial Differential Equations, volume 19 ofGraduate Studies in Mathematics. American Mathematical Society, 2nd edition, 2010

work page 2010

[34] [34]

J. Fan, A. Taghvaei, and Y . Chen. Scalable computations of Wasserstein barycenter via input convex neural networks. InProceedings of the 38th International Conference on Machine Learning, volume 139, pages 1571–1581. PMLR, 2021

work page 2021

[35] [35]

Feydy.Geometric Data Analysis, Beyond Convolutions

J. Feydy.Geometric Data Analysis, Beyond Convolutions. Phd thesis, Universit ´e Paris-Saclay, France, 2020

work page 2020

[36] [36]

Feydy, P

J. Feydy, P. Roussillon, A. Trouv´e, and P. Gori. Fast and scalable optimal transport for brain tractograms. InMedical Image Computing and Computer Assisted Intervention – MICCAI 2019, volume 11767 of Lecture Notes in Computer Science, pages 636–644. Springer, 2019

work page 2019

[37] [37]

Feydy, T

J. Feydy, T. S ´ejourn´e, F.-X. Vialard, S.-i. Amari, A. Trouv ´e, and G. Peyr ´e. Interpolating between opti- mal transport and MMD using Sinkhorn divergences. InThe 22nd international conference on artificial intelligence and statistics, pages 2681–2690. PMLR, 2019

work page 2019

[38] [38]

Flamary, K

R. Flamary, K. Lounici, and A. Ferrari. Concentration bounds for linear Monge mapping estimation and optimal transport domain adaptation.Preprint, arXiv:1905.10155, 2019

work page arXiv 1905

[39] [39]

Flamary, N

R. Flamary, N. Courty, A. Gramfort, M. Z. Alaya, A. Boisbunon, S. Chambon, L. Chapel, A. Corenflos, K. Fatras, N. Fournier, et al. POT: Python optimal transport.Journal of Machine Learning Research, 22 (78):1–8, 2021

work page 2021

[40] [40]

Fournier and A

N. Fournier and A. Guillin. On the rate of convergence in Wasserstein distance of the empirical measure. Probability theory and related fields, 162(3-4):707–738, 2015

work page 2015

[41] [41]

Franklin and J

J. Franklin and J. Lorenz. On the scaling of multidimensional matrices.Linear Algebra Appl., 114/115: 717–735, 1989

work page 1989

[42] [42]

Genevay, M

A. Genevay, M. Cuturi, G. Peyr ´e, and F. Bach. Stochastic optimization for large-scale optimal transport. InAdvances in Neural Information Processing Systems, volume 29, 2016

work page 2016

[43] [43]

Ghosal and M

P. Ghosal and M. Nutz. On the convergence rate of Sinkhorn’s algorithm.Mathematics of Operations Research, 2025

work page 2025

[44] [44]

Ghosal and B

P. Ghosal and B. Sen. Multivariate ranks and quantiles using optimal transport: consistency, rates and nonparametric testing.Ann. Statist., 50(2):1012–1037, 2022

work page 2022

[45] [45]

N. Gigli. On H ¨older continuity-in-time of the optimal transport map towards measures along a curve. Proceedings of the Edinburgh Mathematical Society, 54(2):401–409, 2011

work page 2011

[46] [46]

Gonz ´alez-Sanz, L

A. Gonz ´alez-Sanz, L. De Lara, L. B ´ethune, and J.-M. Loubes. GAN estimation of Lipschitz optimal transport maps.Preprint, arXiv:2202.07965, 2022

work page arXiv 2022

[47] [47]

H ¨utter and P

J.-C. H ¨utter and P. Rigollet. Minimax estimation of smooth optimal transport maps.The Annals of Statistics, 49(2):1166–1194, 2021

work page 2021

[48] [48]

Janati, M

H. Janati, M. Cuturi, and A. Gramfort. Debiased Sinkhorn barycenters. InInternational Conference on Machine Learning, pages 4692–4701. PMLR, 2020

work page 2020

[49] [49]

L. V . Kantorovich. On a problem of Monge.CR (Doklady) Acad. Sci. URSS (NS), 3:225–226, 1948. 66 Z. CHEN, A. NEUFELD, AND Q. XIANG

work page 1948

[50] [50]

H. Karcher. Riemannian center of mass and mollifier smoothing.Communications on Pure and Applied Mathematics, 30(5):509–541, 1977

work page 1977

[51] [51]

K. Kim, R. Yao, C. Zhu, and X. Chen. Optimal transport barycenter via nonconvex-concave minimax optimization. InProceedings of the 42nd International Conference on Machine Learning, volume 267 of Proceedings of Machine Learning Research, pages 30879–30899. PMLR, 2025

work page 2025

[52] [52]

Korotin, V

A. Korotin, V . Egiazarian, L. Li, and E. Burnaev. Wasserstein iterative networks for barycenter estimation. InAdvances in Neural Information Processing Systems, volume 35, pages 15672–15686, 2022

work page 2022

[53] [53]

T. T.-K. Lau and H. Liu. Wasserstein distributionally robust optimization with wasserstein barycenters. Preprint, arXiv:2203.12136, 2022

work page arXiv 2022

[54] [54]

Le Gouic and J.-M

T. Le Gouic and J.-M. Loubes. Existence and consistency of Wasserstein barycenters.Probab. Theory Related Fields, 168(3-4):901–917, 2017

work page 2017

[55] [55]

L. Li, A. Genevay, M. Yurochkin, and J. M. Solomon. Continuous regularized Wasserstein barycenters. InAdvances in Neural Information Processing Systems, volume 33, pages 17755–17765, 2020

work page 2020

[56] [56]

Makkuva, A

A. Makkuva, A. Taghvaei, S. Oh, and J. Lee. Optimal transport mapping via input convex neural networks. InInternational Conference on Machine Learning, pages 6672–6681. PMLR, 2020

work page 2020

[57] [57]

Manole, S

T. Manole, S. Balakrishnan, J. Niles-Weed, and L. Wasserman. Plugin estimation of smooth optimal transport maps.The Annals of Statistics, 52(3):966–998, 2024

work page 2024

[58] [58]

A. J. McNeil, R. Frey, and P. Embrechts.Quantitative risk management: concepts, techniques and tools- revised edition. Princeton University Press, 2015

work page 2015

[59] [59]

Mendelson and A

S. Mendelson and A. Pajor. On singular values of matrices with independent rows.Bernoulli, 12(5): 761–773, 2006

work page 2006

[60] [60]

Minsker, S

S. Minsker, S. Srivastava, L. Lin, and D. Dunson. Scalable and robust bayesian inference via the median posterior. InInternational conference on machine learning, pages 1656–1664. PMLR, 2014

work page 2014

[61] [61]

E. F. Montesuma, Y . Bendou, and M. Gartrell. Computing Wasserstein barycenters through gradient flows. Preprint, arXiv:2510.04602, 2025

work page arXiv 2025

[62] [62]

Muzellec, A

B. Muzellec, A. Vacher, F. Bach, F.-X. Vialard, and A. Rudi. Near-optimal estimation of smooth transport maps with kernel sums-of-squares.Preprint, arXiv:2112.01907, 2021

work page arXiv 2021

[63] [63]

Nesterov.Introductory lectures on convex optimization: A basic course, volume 87

Y . Nesterov.Introductory lectures on convex optimization: A basic course, volume 87. Springer Science & Business Media, 2004

work page 2004

[64] [64]

V . M. Panaretos and Y . Zemel.An invitation to statistics in Wasserstein space. Springer, 2020

work page 2020

[65] [65]

F.-P. Paty, A. d’Aspremont, and M. Cuturi. Regularity as regularization: Smooth and strongly convex Brenier potentials in optimal transport. InProceedings of the 23rd International Conference on Artificial Intelligence and Statistics, volume 108, pages 1222–1232. PMLR, 2020

work page 2020

[66] [66]

E. V . Petracou, A. Xepapadeas, and A. N. Yannacopoulos. Decision making under model uncertainty: Fr´echet–Wasserstein mean preferences.Management Science, 68(2):1195–1211, 2022

work page 2022

[67] [67]

Peyr ´e and M

G. Peyr ´e and M. Cuturi. Computational optimal transport: With applications to data science.Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019

work page 2019

[68] [68]

B. T. Polyak. Gradient methods for the minimisation of functionals.USSR Computational Mathematics and Mathematical Physics, 3(4):864–878, 1963

work page 1963

[69] [69]

Pooladian and J

A.-A. Pooladian and J. Niles-Weed. Entropic estimation of optimal transport maps.Preprint, arXiv:2109.12004, 2024

work page arXiv 2024

[70] [70]

Rabin, G

J. Rabin, G. Peyr ´e, J. Delon, and M. Bernot. Wasserstein barycenter and its application to texture mixing. InScale Space and Variational Methods in Computer Vision: Third International Conference, SSVM 2011, Ein-Gedi, Israel, May 29–June 2, 2011, Revised Selected Papers 3, pages 435–446. Springer, 2012

work page 2011

[71] [71]

R. T. Rockafellar.Convex Analysis:(PMS-28). Princeton university press, 1970

work page 1970

[72] [72]

R. T. Rockafellar and R. J.-B. Wets.Variational analysis, volume 317 ofGrundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer-Verlag, Berlin, 1998

work page 1998

[73] [73]

Rychener, A

Y . Rychener, A. Esteban-P´erez, J. M. Morales, and D. Kuhn. Wasserstein distributionally robust optimiza- tion with heterogeneous data sources.Preprint, arXiv:2407.13582, 2024

work page arXiv 2024

[74] [74]

Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Nonlinear Differential Equations and their Applications

F. Santambrogio.Optimal transport for applied mathematicians, volume 87 ofProgress in Nonlinear Differential Equations and their Applications. Birkh¨auser/Springer, Cham, 2015

work page 2015

[75] [75]

Sinkhorn

R. Sinkhorn. A relationship between arbitrary positive matrices and doubly stochastic matrices.Ann. Math. Statist., 35:876–879, 1964. PROV ABLY CONVERGENT ALGORITHM FOR FREE-SUPPORT W ASSERSTEIN BARYCENTER 67

work page 1964

[76] [76]

Srivastava, V

S. Srivastava, V . Cevher, Q. Dinh, and D. Dunson. W ASP: Scalable Bayes via barycenters of subset poste- riors. InProceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, volume 38, pages 912–920. PMLR, 2015

work page 2015

[77] [77]

Srivastava, C

S. Srivastava, C. Li, and D. B. Dunson. Scalable Bayes via barycenter in Wasserstein space.Journal of Machine Learning Research, 19(8):1–35, 2018

work page 2018

[78] [78]

Tas ¸kesen, S

B. Tas ¸kesen, S. Shafieezadeh-Abadeh, and D. Kuhn. Semi-discrete optimal transport: hardness, regular- ization and numerical solution.Mathematical Programming, 199:1033–1106, 2023

work page 2023

[79] [79]

Tanguy, J

E. Tanguy, J. Delon, and N. Gozlan. Computing barycentres of measures for generic transport costs. Preprint, arXiv:2501.04016, 2024

work page arXiv 2024

[80] [80]

Tas ¸kesen, S

B. Tas ¸kesen, S. Shafieezadeh-Abadeh, D. Kuhn, and K. Natarajan. Discrete optimal transport with inde- pendent marginals is #P-hard.SIAM Journal on Optimization, 33(2):589–614, 2023

work page 2023