pith. sign in

arxiv: 2601.00632 · v2 · pith:RS5AJSYInew · submitted 2026-01-02 · 🧮 math.OC

Variational inference via Gaussian interacting particles in the Bures-Wasserstein geometry

Pith reviewed 2026-05-16 18:27 UTC · model grok-4.3

classification 🧮 math.OC
keywords variational inferenceBures-Wasserstein geometryinteracting particlesconsensus-based optimizationGaussian measureszeroth-order methodsmean-field limit
0
0 comments X

The pith

Interacting Gaussian particles optimize variational inference in the linearized Bures-Wasserstein space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a zeroth-order algorithm that uses a system of interacting Gaussian particles to solve optimization problems over Gaussian measures for variational inference. These particles stochastically explore the space and reach consensus around global minima inside the new Linearized Bures-Wasserstein parametrization, which keeps essential geometric properties while remaining computationally feasible. The approach matters because it avoids gradient computations and shows stronger robustness than deterministic methods when the target distributions are low-dimensional and non-log-concave, according to the reported experiments. The authors prove the particle dynamics are well-posed and examine their convergence through a mean-field limit.

Core claim

The authors introduce the Linearized Bures-Wasserstein space as a tractable parametrization of Gaussian measures and build an interacting particle system that performs consensus-based optimization to locate global minima. They establish well-posedness of the stochastic dynamics and study their convergence properties via a mean-field approximation.

What carries the argument

The Linearized Bures-Wasserstein (LBW) parametrization of Gaussian measures, which enables efficient computations while retaining key geometric features from optimal transport to support the interacting particle dynamics.

If this is right

  • The algorithm converges to global optima in the space of Gaussian measures.
  • Numerical tests show better robustness and performance than deterministic gradient methods on non-log-concave targets.
  • The mean-field limit captures the long-time dynamics of the finite-particle system.
  • Well-posedness of the stochastic particle dynamics holds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar linearizations could extend the method to mixtures of Gaussians or other measure classes.
  • The stochastic consensus mechanism may help with multimodal posteriors common in Bayesian settings.
  • Direct comparisons to other particle-based variational inference methods would clarify relative strengths.

Load-bearing premise

The linearized Bures-Wasserstein parametrization preserves enough of the original geometry for the consensus mechanism to reach global minima, and the mean-field limit accurately describes the long-time behavior of the finite-particle system.

What would settle it

Numerical runs on the same low-dimensional non-log-concave targets where the particle system fails to converge to the reported global minima or performs no better than gradient methods would falsify the performance advantage.

Figures

Figures reproduced from arXiv: 2601.00632 by Giacomo Borghi, Jos\'e A. Carrillo.

Figure 1
Figure 1. Figure 1: Evolution of N = 10 Gaussian particles for minimization of E = KL divergence from target bi-modal measure (contour lines). Particles evolve according to the CBO-type dynamics we design in this paper for problems of type (1.2) (see Section 4 for the definition). Final plot compares the solution computed by the CBO algorithm with the one of BW Gradient Flow algorithm [36]. Corresponding KL values are also sh… view at source ↗
Figure 2
Figure 2. Figure 2: Visual comparison between the geometry of BW and its linearization LBW. Different [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: A Brownian path in LBW space. If BT t is a Brownian particle in Symd , it may leave the cone of optimal transport maps, see (a) (equivalently, I + BT t leaves the cone of positive semi-definite matrices Sym+ d ). The extended exponential map (2.4), though, automatically reflects the dynamics so that Σt = expΣ0 (BT t ) ∈ Sym+ d without additional computational effort, see (b). To visualize a symmetric matri… view at source ↗
Figure 4
Figure 4. Figure 4: Comparison between one run of the CBO and GF algorithms in approximating a [PITH_FULL_IMAGE:figures/full_fig_p018_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Comparison between CBO and GF algorithms in approximating different target [PITH_FULL_IMAGE:figures/full_fig_p020_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Sensitivity analysis of CBO performance with respect to (A) the diffusion parameter [PITH_FULL_IMAGE:figures/full_fig_p021_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Comparison between CBO and GF in approximating Gaussian mixture targets ( [PITH_FULL_IMAGE:figures/full_fig_p022_7.png] view at source ↗
read the original abstract

Motivated by variational inference methods, we propose a zeroth-order algorithm for solving optimization problems in the space of Gaussian probability measures. The algorithm is based on an interacting system of Gaussian particles that stochastically explore the search space and self-organize around global minima via a consensus-based optimization (CBO) mechanism. Its construction relies on the Linearized Bures-Wasserstein (LBW) space, a novel parametrization of Gaussian measures we introduce for efficient computations. LBW is inspired by linearized optimal transport and preserves key geometric features while enabling computational tractability. We establish well-posedness and study the convergence properties of the particle dynamics via a mean-field approximation. Numerical experiments on variational inference tasks demonstrate the algorithm's robustness and superior performance with respect to deterministic gradient-based method in presence of low-dimensional non log-concave targets.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a zeroth-order algorithm for optimization over Gaussian probability measures using an interacting system of Gaussian particles that self-organize via consensus-based optimization (CBO) in a novel Linearized Bures-Wasserstein (LBW) parametrization. Motivated by variational inference, the construction enables tractable computations while preserving key geometric features of the Bures-Wasserstein space. Well-posedness of the dynamics is established and convergence is studied via a mean-field limit approximation. Numerical experiments on low-dimensional non-log-concave targets claim robustness and superiority over deterministic gradient-based methods.

Significance. If the mean-field convergence analysis can be strengthened with explicit error controls, the work would provide a novel bridge between consensus-based optimization and linearized optimal transport geometries, offering a scalable particle method for non-convex variational inference problems where standard gradient approaches fail. The LBW space is a useful modeling choice that could influence future particle-based algorithms in probability measure spaces.

major comments (3)
  1. [Convergence analysis (mean-field limit)] The convergence analysis relies on the mean-field limit to justify long-time behavior of the finite-particle system, but provides no uniform-in-time error bounds or quantitative controls on the distance between the N-particle empirical measure and the mean-field PDE, especially for non-log-concave targets where the linearization may introduce spurious equilibria.
  2. [Numerical experiments] The superiority claim rests on numerical experiments for low-dimensional non-log-concave targets, yet the manuscript supplies no quantitative metrics (e.g., KL divergence values, convergence rates), experimental setup details (initializations, specific target distributions, dimension values), or ablation studies, preventing verification of the robustness assertion.
  3. [LBW parametrization definition and properties] The LBW parametrization is asserted to preserve key geometric features (including those relevant to consensus forces) while remaining tractable, but no explicit verification or counterexample analysis is given showing that geodesic convexity properties survive the linearization around a reference measure when the target is multimodal.
minor comments (2)
  1. [Abstract] The abstract contains the unhyphenated phrase 'non log-concave'; standardize to 'non-log-concave' for consistency with mathematical literature.
  2. [Notation and preliminaries] Notation for the LBW space, particle interactions, and mean-field limit should be introduced with a dedicated table or glossary to aid readability across sections.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We provide point-by-point responses below and will make revisions to address the concerns where possible.

read point-by-point responses
  1. Referee: [Convergence analysis (mean-field limit)] The convergence analysis relies on the mean-field limit to justify long-time behavior of the finite-particle system, but provides no uniform-in-time error bounds or quantitative controls on the distance between the N-particle empirical measure and the mean-field PDE, especially for non-log-concave targets where the linearization may introduce spurious equilibria.

    Authors: We appreciate this observation. The manuscript establishes well-posedness of the finite-particle dynamics and analyzes the mean-field limit to characterize the long-time behavior. However, we do not provide explicit uniform-in-time error bounds or quantitative controls on the approximation error for the empirical measure, particularly in the non-log-concave setting. In the revised version, we will include a dedicated remark discussing this limitation and its implications for the analysis, along with suggestions for future work on quantitative convergence rates. revision: yes

  2. Referee: [Numerical experiments] The superiority claim rests on numerical experiments for low-dimensional non-log-concave targets, yet the manuscript supplies no quantitative metrics (e.g., KL divergence values, convergence rates), experimental setup details (initializations, specific target distributions, dimension values), or ablation studies, preventing verification of the robustness assertion.

    Authors: We agree that additional details are necessary to substantiate the numerical claims. The revised manuscript will expand the numerical experiments section to include quantitative metrics such as KL divergence values and convergence rates, detailed experimental setups specifying initializations, target distributions, and dimension values, as well as ablation studies to demonstrate robustness. revision: yes

  3. Referee: [LBW parametrization definition and properties] The LBW parametrization is asserted to preserve key geometric features (including those relevant to consensus forces) while remaining tractable, but no explicit verification or counterexample analysis is given showing that geodesic convexity properties survive the linearization around a reference measure when the target is multimodal.

    Authors: The LBW parametrization is constructed via linearization in the Bures-Wasserstein geometry to retain essential properties for the consensus-based optimization, such as the structure of the consensus forces. To strengthen this, the revised manuscript will provide explicit verification of the preserved geometric features, including an analysis of how the linearization affects geodesic convexity for multimodal targets, supported by relevant derivations or illustrative examples. revision: yes

Circularity Check

0 steps flagged

No circularity: LBW parametrization and mean-field analysis are independent modeling choices built on external CBO and OT ideas

full rationale

The derivation introduces the Linearized Bures-Wasserstein parametrization as an explicit modeling choice inspired by linearized optimal transport, not obtained by fitting or redefinition from the target variational inference result. Well-posedness of the particle system and convergence properties are established through standard mean-field limit arguments applied to the CBO dynamics, which are not equivalent by construction to the finite-particle numerics or the claimed robustness on non-log-concave targets. Numerical experiments compare against gradient methods on separate low-dimensional tasks without reusing fitted parameters as predictions. No self-definitional reductions, fitted-input predictions, or load-bearing self-citations appear in the chain; the central claims rest on external consensus-based optimization literature and empirical validation rather than internal re-labeling.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the novel LBW parametrization being a faithful yet tractable proxy for Bures-Wasserstein geometry and on the validity of the mean-field limit for the interacting particle system.

axioms (1)
  • domain assumption The Linearized Bures-Wasserstein space preserves key geometric features of the Bures-Wasserstein geometry while enabling computational tractability.
    Invoked in the abstract as the foundation for the algorithm construction.
invented entities (1)
  • Linearized Bures-Wasserstein (LBW) space no independent evidence
    purpose: A novel parametrization of Gaussian measures that allows efficient computations while retaining essential geometric structure.
    Introduced by the authors as the core modeling device.

pith-pipeline@v0.9.0 · 5441 in / 1374 out tokens · 29145 ms · 2026-05-16T18:27:58.933011+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages

  1. [1]

    SIAM Journal on Mathematical Analysis , author =

    M. Agueh and G. Carlier. Barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011. arXiv:https://doi.org/10.1137/ 100805741,doi:10.1137/100805741

  2. [2]

    Alvarez-Melis, Y

    D. Alvarez-Melis, Y. Schiff, and Y. Mroueh. Optimizing functionals on the space of probabilities with input convex neural networks.Transactions on Machine Learning Research, 2022. URL:https://openreview.net/forum?id=dpOYN7o8Jm

  3. [3]

    Ambrosio, N

    L. Ambrosio, N. Gigli, and G. Savar´ e.Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Z¨ urich. Birkh¨ auser, 2. ed edition, 2008. OCLC: 254181287

  4. [4]

    Arasaratnam and S

    I. Arasaratnam and S. Haykin. Cubature Kalman filters.IEEE Transactions on Automatic Control, 54(6):1254–1269, 2009.doi:10.1109/TAC.2009.2019800

  5. [5]

    Bhatia, T

    R. Bhatia, T. Jain, and Y. Lim. On the Bures–Wasserstein distance between positive definite matrices.Expositiones Mathematicae, 37(2):165– 191, 2019. URL: https://www.sciencedirect.com/science/article/pii/ S0723086918300021,doi:10.1016/j.exmath.2018.01.002

  6. [6]

    A. N. Bishop and A. Doucet. Distributed nonlinear consensus in the space of probability measures.IFAC Proceedings Volumes, 47(3):8662–8668, 2014. 19th IFAC World Congress. URL: https://www.sciencedirect.com/science/article/pii/ S1474667016429800,doi:10.3182/20140824-6-ZA-1003.00341

  7. [7]

    A. N. Bishop and A. Doucet. Network consensus in the Wasserstein metric space of probability measures.SIAM Journal on Control and Optimization, 59(5):3261–3277, 2021.arXiv:https://doi.org/10.1137/19M1268252,doi:10.1137/19M1268252

  8. [8]

    D. M. Blei, A. Kucukelbir, and J. D. McAuliffe. Variational inference: A review for statisticians.Journal of the American Statistical Association, 112(518):859–877,

  9. [9]

    arXiv:https://doi.org/10.1080/01621459.2017.1285773, doi:10.1080/ 01621459.2017.1285773

  10. [10]

    Borghi and J

    G. Borghi and J. Carrillo. Variational inference via Gaussian interacting particles in the Bures-Wasserstein geometry, 2025. Media.doi:10.6084/m9.figshare.30958322

  11. [11]

    Borghi, M

    G. Borghi, M. Herty, and L. Pareschi. Constrained consensus-based optimization. SIAM Journal on Optimization, 33(1):211–236, 2023. arXiv:https://doi.org/10. 1137/22M1471304,doi:10.1137/22M1471304

  12. [12]

    Borghi, M

    G. Borghi, M. Herty, and L. Pareschi. Kinetic models for optimization: A unified mathematical framework for metaheuristics.arXiv preprint arXiv:2410.10369, 2024. URL:https://arxiv.org/abs/2410.10369,arXiv:2410.10369

  13. [13]

    Borghi, M

    G. Borghi, M. Herty, and A. Stavitskiy. Dynamics of measure-valued agents in the space of probabilities.SIAM Journal on Mathematical Analysis, 57(5):5107–5134, 2025.arXiv:https://doi.org/10.1137/24M1675515,doi:10.1137/24M1675515. 23

  14. [14]

    Y. Brenier. Polar factorization and monotone rearrangement of vector-valued func- tions.Communications on Pure and Applied Mathematics, 44(4):375–417, 1991. URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/cpa.3160440402, arXiv: https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpa.3160440402, doi:10. 1002/cpa.3160440402

  15. [15]

    T. Cai, J. Cheng, N. Craig, and K. Craig. Linearized optimal transport for collider events.Phys. Rev. D, 102:116019, Dec 2020. URL: https://link.aps.org/doi/10. 1103/PhysRevD.102.116019,doi:10.1103/PhysRevD.102.116019

  16. [16]

    Carlier, A

    G. Carlier, A. Delalande, and Q. M´ erigot. Quantitative stability of the pushforward operation by an optimal transport map.Foundations of Computational Mathematics, 2024.doi:10.1007/s10208-024-09669-4

  17. [17]

    J. A. Carrillo, Y.-P. Choi, C. Totzeck, and O. Tse. An analytical framework for consensus-based global optimization method.Math. Models Methods Appl. Sci., 28(6):1037–1066, 2018.doi:10.1142/S0218202518500276

  18. [18]

    J. A. Carrillo, S. Jin, L. Li, and Y. Zhu. A consensus-based global optimization method for high dimensional machine learning problems.ESAIM: Control, Optimisation and Calculus of Variations, 27:S5, 2021

  19. [19]

    Chewi, T

    S. Chewi, T. Maunu, P. Rigollet, and A. Stromme. Gradient descent algorithms for Bures-Wasserstein barycenters. In J. Abernethy and S. Agarwal, editors,Pro- ceedings of Thirty Third Conference on Learning Theory, volume 125 ofProceedings of Machine Learning Research, pages 1276–1304. PMLR, 09–12 Jul 2020. URL: https://proceedings.mlr.press/v125/chewi20a.html

  20. [20]

    Cisneros-Velarde and F

    P. Cisneros-Velarde and F. Bullo. Distributed Wasserstein barycenters via displace- ment interpolation.IEEE Transactions on Control of Network Systems, 10(2):785–795, 2023.doi:10.1109/TCNS.2022.3210341

  21. [21]

    Delalande and Q

    A. Delalande and Q. Merigot. Quantitative stability of optimal transport maps under variations of the target measure.Duke Mathematical Journal, 172(17):3321–3357, 2023

  22. [22]

    Dembo and O

    A. Dembo and O. Zeitouni.Large Deviations Techniques and Applications. Springer- Verlag Berlin Heidelberg, 2010

  23. [23]

    M. Z. Diao, K. Balasubramanian, S. Chewi, and A. Salim. Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein space. In A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, editors,Proceedings of the 40th International Conference on Machine Learning, volume 202 ofProceedings of Machine Learning Research, p...

  24. [24]

    Journal of Multivariate Analysis , author =

    D. Dowson and B. Landau. The Fr´ echet distance between multivari- ate normal distributions.Journal of Multivariate Analysis, 12(3):450– 455, 1982. URL: https://www.sciencedirect.com/science/article/pii/ 0047259X8290077X,doi:10.1016/0047-259X(82)90077-X

  25. [25]

    Fornasier, T

    M. Fornasier, T. Klock, and K. Riedl. Consensus-based optimization methods converge globally.SIAM Journal on Optimization, 34(3):2973–3004, 2024. arXiv:https: //doi.org/10.1137/22M1527805,doi:10.1137/22M1527805

  26. [26]

    Gerber, F

    N. Gerber, F. Hoffmann, D. Kim, and U. Vaes. Uniform-in-time propagation of chaos for consensus-based optimization.arXiv preprint arXiv:2505.08669, 2025

  27. [27]

    M. B. Giles. Collected matrix derivative results for forward and reverse mode algorithmic differentiation. In C. H. Bischof, H. M. B¨ ucker, P. Hovland, U. Naumann, 24 and J. Utke, editors,Advances in Automatic Differentiation, pages 35–44, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg

  28. [28]

    C. R. Givens and R. M. Shortt. A class of Wasserstein metrics for probability distributions.Michigan Mathematical Journal, 31(2):231–240, 1984

  29. [29]

    A. Han, B. Mishra, P. Jawanpuria, and J. Gao. On Riemannian optimization over positive definite matrices with the Bures–Wasserstein geometry. In A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, 2021. URL: https://openreview.net/forum?id=ZCHxGFmc62a

  30. [30]

    Holland.Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence

    J. Holland.Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. University of Michi- gan Press, 1975. URL:https://books.google.co.uk/books?id=JE5RAAAAMAAJ

  31. [31]

    Katsevich and P

    A. Katsevich and P. Rigollet. On the approximation accuracy of Gaussian variational inference.The Annals of Statistics, 52(4):1384–1409, 2024. doi:10.1214/24-AOS2393

  32. [32]

    Kennedy and R

    J. Kennedy and R. Eberhart. Particle swarm optimization. InProceedings of ICNN’95- international conference on neural networks, volume 4, pages 1942–1948. ieee, 1995

  33. [33]

    M. E. Khan and H. Rue. The Bayesian learning rule.J. Mach. Learn. Res., 24(1), Jan. 2023

  34. [34]

    doi:10.1126/science.220.4598.671 Jeffrey C

    S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated anneal- ing.Science, 220(4598):671–680, 1983. URL: https://www.science.org/doi/abs/ 10.1126/science.220.4598.671, arXiv:https://www.science.org/doi/pdf/10. 1126/science.220.4598.671,doi:10.1126/science.220.4598.671

  35. [35]

    Knoblauch, J

    J. Knoblauch, J. Jewson, and T. Damoulas. An optimization-centric view on Bayes’ rule: Reviewing and generalizing variational inference.Journal of Machine Learning Research, 23(132):1–109, 2022. URL: http://jmlr.org/papers/v23/19-1047.html

  36. [36]

    Kolouri, A

    S. Kolouri, A. B. Tosun, J. A. Ozolek, and G. K. Rohde. A continuous linear optimal transport approach for pattern analysis in image datasets.Pattern recognition, 51:453– 462, 2016

  37. [37]

    Lambert, S

    M. Lambert, S. Chewi, F. Bach, S. Bonnabel, and P. Rigollet. Variational inference via Wasserstein gradient flows. In A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho, editors,Advances in Neural Information Processing Systems, 2022. URL: https: //openreview.net/forum?id=K2PTuvVTF1L

  38. [38]

    Liero, A

    M. Liero, A. Mielke, O. Tse, and J.-J. Zhu. Evolution of Gaussians in the Hellinger-Kantorovich-Boltzmann gradient flow.Communications on Pure and Ap- plied Analysis, (early access), 2025. URL: https://www.aimsciences.org/article/ id/68ecc16acb5dde21e7544b1b,doi:10.3934/cpaa.2025105

  39. [39]

    J. Lott. Some geometric calculations on Wasserstein space.Communications in Mathematical Physics, 277(2):423–437, 2008.doi:10.1007/s00220-007-0367-3

  40. [40]

    Malag` o, L

    L. Malag` o, L. Montrucchio, and G. Pistone. Wasserstein Riemannian geometry of Gaussian densities.Information Geometry, 1(2):137–179, Dec 2018. doi:10.1007/ s41884-018-0014-4

  41. [41]

    Moosm¨ uller and A

    C. Moosm¨ uller and A. Cloninger. Linear optimal transport embedding: provable Wasserstein classification for certain rigid transformations and perturbations.Infor- mation and Inference: A Journal of the IMA, 12(1):363–389, 2023

  42. [42]

    Olkin and F

    I. Olkin and F. Pukelsheim. The distance between two random vec- tors with given dispersion matrices.Linear Algebra and its Applications, 48:257–263, 1982. URL: https://www.sciencedirect.com/science/article/pii/ 0024379582901124,doi:10.1016/0024-3795(82)90112-4. 25

  43. [43]

    F. Otto. The geometry of dissipative evolution equations: The porous medium equation.Communications in Partial Differential Equations, 26(1-2):101–174, 2001. doi:10.1081/PDE-100002243

  44. [44]

    Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality

    F. Otto and C. Villani. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality.Journal of Functional Analysis, 173(2):361–400, 2000. URL: https://www.sciencedirect.com/science/article/ pii/S0022123699935577,doi:10.1006/jfan.1999.3557

  45. [45]

    Pennec, P

    X. Pennec, P. Fillard, and N. Ayache. A Riemannian framework for tensor computing.International Journal of Computer Vision, 66(1):41–66, 2006. doi: 10.1007/s11263-005-3222-z

  46. [46]

    Pettersson

    R. Pettersson. Projection scheme for stochastic differential equations with convex constraints.Stochastic Processes and their Applications, 88(1):125– 134, 2000. URL: https://www.sciencedirect.com/science/article/pii/ S0304414999001210,doi:10.1016/S0304-4149(99)00121-0

  47. [47]

    Pilipenko.An introduction to stochastic differential equations with reflection

    A. Pilipenko.An introduction to stochastic differential equations with reflection. Universit¨ atsverlag Potsdam, 2014

  48. [48]

    Pinnau and C

    R. Pinnau, C. Totzeck, O. Tse, and S. Martin. A consensus-based model for global optimization and its mean-field limit.Math. Models Methods Appl. Sci., 27(1):183–204, 2017.doi:10.1142/S0218202517400061

  49. [49]

    Polyanskiy and Y

    Y. Polyanskiy and Y. Wu. Wasserstein continuity of entropy and outer bounds for interference channels.IEEE Transactions on Information Theory, 62(7):3992–4002, 2016

  50. [50]

    Ren and F.-Y

    P. Ren and F.-Y. Wang. Ornstein–Uhlenbeck type processes on Wasserstein spaces. Stochastic Processes and their Applications, 172:104339, 2024. doi:10.1016/j.spa. 2024.104339

  51. [51]

    R¨ uschendorf and L

    L. R¨ uschendorf and L. Uckelmann. On the n-coupling problem.Journal of Multivariate Analysis, 81(2):242–258, 2002. URL: https://www.sciencedirect.com/science/ article/pii/S0047259X01920056,doi:10.1006/jmva.2001.2005

  52. [52]

    Santambrogio.Optimal Transport for Applied Mathematicians

    F. Santambrogio.Optimal Transport for Applied Mathematicians. Birkh¨ auser, 2015

  53. [53]

    Sarrazin and B

    C. Sarrazin and B. Schmitzer. Linearized optimal transport on manifolds.SIAM Journal on Mathematical Analysis, 56(4):4970–5016, 2024. arXiv:https://doi.org/ 10.1137/23M1564535,doi:10.1137/23M1564535

  54. [54]

    SIAM Rev.58(3), 377–441 (2016)

    V. Simoncini. Computational methods for linear matrix equations.SIAM Review, 58(3):377–441, 2016. arXiv:https://doi.org/10.1137/130912839, doi:10.1137/ 130912839

  55. [55]

    Spokoiny and M

    V. Spokoiny and M. Panov. Accuracy of Gaussian approximation for high-dimensional posterior distributions.Bernoulli, 31(2):843 – 867, 2025. doi:10.3150/21-BEJ1412

  56. [56]

    Sznitman

    A.-S. Sznitman. Topics in propagation of chaos. InEcole d’´ et´ e de probabilit´ es de Saint-Flour XIX—1989, pages 165–251. Springer, 1991

  57. [57]

    A. Takatsu. Wasserstein geometry of Gaussian measures.Osaka Journal of Mathe- matics, 48(4):1005 – 1026, 2011

  58. [58]

    Thanwerdas and X

    Y. Thanwerdas and X. Pennec. Bures–Wasserstein minimizing geodesics between covariance matrices of different ranks.SIAM Journal on Matrix Analysis and Ap- plications, 44(3):1447–1476, 2023. arXiv:https://doi.org/10.1137/22M149168X, doi:10.1137/22M149168X. 26

  59. [59]

    , journal =

    L. Tierney and J. B. Kadane. Accurate approximations for posterior mo- ments and marginal densities.Journal of the American Statistical Associa- tion, 81(393):82–86, 1986. URL: https://www.tandfonline.com/doi/abs/10. 1080/01621459.1986.10478240, arXiv:https://www.tandfonline.com/doi/pdf/ 10.1080/01621459.1986.10478240,doi:10.1080/01621459.1986.10478240

  60. [60]

    A. Uhlmann. The metric of Bures and the geometric phase.Quantum groups and related topics, pages 267–264, 1992

  61. [61]

    Vayer and R

    T. Vayer and R. Gribonval. Controlling Wasserstein distances by kernel norms with application to compressive statistical learning.Journal of Machine Learning Research, 24(149):1–51, 2023. URL:http://jmlr.org/papers/v24/21-1516.html

  62. [62]

    Bernoulli , author =

    Y. Zemel and V. M. Panaretos. Fr´ echet means and Procrustes analysis in Wasserstein space.Bernoulli, 25(2):932 – 976, 2019.doi:10.3150/17-BEJ1009

  63. [63]

    P. C. ´Alvarez Esteban, E. del Barrio, J. Cuesta-Albertos, and C. Matr´ an. A fixed-point approach to barycenters in Wasserstein space.Journal of Mathematical Analysis and Applications, 441(2):744–762, 2016. URL: https://www.sciencedirect.com/ science/article/pii/S0022247X16300907,doi:10.1016/j.jmaa.2016.04.045. 27