Variational inference via Gaussian interacting particles in the Bures-Wasserstein geometry

Giacomo Borghi; José A. Carrillo

arxiv: 2601.00632 · v2 · pith:RS5AJSYInew · submitted 2026-01-02 · 🧮 math.OC

Pith reviewed 2026-05-16 18:27 UTC · model grok-4.3

keywords: variational inference · Bures-Wasserstein geometry · interacting particles · consensus-based optimization · Gaussian measures · zeroth-order methods · mean-field limit

The pith

Interacting Gaussian particles optimize variational inference in the linearized Bures-Wasserstein space.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a zeroth-order algorithm that uses a system of interacting Gaussian particles to solve optimization problems over Gaussian measures for variational inference. These particles stochastically explore the space and reach consensus around global minima inside the new Linearized Bures-Wasserstein parametrization, which keeps essential geometric properties while remaining computationally feasible. The approach matters because it avoids gradient computations and shows stronger robustness than deterministic methods when the target distributions are low-dimensional and non-log-concave, according to the reported experiments. The authors prove the particle dynamics are well-posed and examine their convergence through a mean-field limit.

Core claim

The authors introduce the Linearized Bures-Wasserstein space as a tractable parametrization of Gaussian measures and build an interacting particle system that performs consensus-based optimization to locate global minima. They establish well-posedness of the stochastic dynamics and study their convergence properties via a mean-field approximation.
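To make the consensus mechanism concrete, here is a minimal sketch of one step of standard consensus-based optimization in flat Euclidean space. It is not the paper's LBW particle system, whose equations are not reproduced on this page. The function names, the isotropic noise, and the parameters `alpha`, `lam`, `sigma`, and `dt` are illustrative assumptions.

```python
import numpy as np

def consensus_point(X, energies, alpha):
    """Gibbs-weighted average of the particle positions X (shape N x d)."""
    # Shift by the minimum energy before exponentiating for numerical
    # stability; the shift cancels in the normalized weights.
    w = np.exp(-alpha * (energies - energies.min()))
    return (w[:, None] * X).sum(axis=0) / w.sum()

def cbo_step(X, energy, alpha, lam, sigma, dt, rng):
    """One Euler-Maruyama step of isotropic CBO in R^d (assumed Euclidean form)."""
    # Zeroth-order: only objective values are used, never gradients.
    energies = np.array([energy(x) for x in X])
    m = consensus_point(X, energies, alpha)
    diff = X - m
    # The drift pulls each particle toward the consensus point. The noise
    # amplitude scales with the distance to it, so exploration fades as
    # the particles agree.
    dist = np.linalg.norm(diff, axis=1, keepdims=True)
    noise = rng.standard_normal(X.shape)
    return X - lam * dt * diff + sigma * np.sqrt(dt) * dist * noise

def tilted_double_well(x):
    """Hypothetical non-convex test objective with two unequal wells per axis."""
    return float(np.sum((x ** 2 - 1.0) ** 2 + 0.3 * x))

rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(50, 2))
for _ in range(500):
    X = cbo_step(X, tilted_double_well, alpha=30.0, lam=1.0, sigma=0.7, dt=0.01, rng=rng)
print("mean particle position after 500 steps:", X.mean(axis=0))
```

A large `alpha` concentrates the weights on the currently best particle, which is what lets the ensemble favor a global minimum over a local one without any gradient information.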

What carries the argument

The Linearized Bures-Wasserstein (LBW) parametrization of Gaussian measures, which enables efficient computations while retaining key geometric features from optimal transport to support the interacting particle dynamics.
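For orientation, the geometry being linearized can be sketched through the well-known closed form of the squared 2-Wasserstein (Bures-Wasserstein) distance between two Gaussians, $|m_1 - m_2|^2 + \mathrm{tr}\big(\Sigma_1 + \Sigma_2 - 2(\Sigma_1^{1/2}\Sigma_2\Sigma_1^{1/2})^{1/2}\big)$. The LBW construction itself is not specified on this page, so the code below covers only the non-linearized distance; the helper names `psd_sqrt` and `bw_distance_sq` are illustrative.

```python
import numpy as np

def psd_sqrt(A):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    w, V = np.linalg.eigh(A)
    w = np.clip(w, 0.0, None)  # guard against tiny negative round-off eigenvalues
    return (V * np.sqrt(w)) @ V.T

def bw_distance_sq(m1, S1, m2, S2):
    """Squared 2-Wasserstein distance between N(m1, S1) and N(m2, S2)."""
    r = psd_sqrt(S1)
    cross = psd_sqrt(r @ S2 @ r)
    return float(np.sum((m1 - m2) ** 2) + np.trace(S1 + S2 - 2.0 * cross))

m1, S1 = np.zeros(2), np.diag([1.0, 4.0])
m2, S2 = np.array([3.0, 4.0]), np.diag([9.0, 16.0])
print("squared BW distance:", bw_distance_sq(m1, S1, m2, S2))
```

When the covariances commute, as in this diagonal example, the covariance term reduces to the squared Euclidean distance between the matrix square roots.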

If this is right

The algorithm converges to global optima in the space of Gaussian measures.
Numerical tests show better robustness and performance than deterministic gradient methods on non-log-concave targets.
The mean-field limit captures the long-time dynamics of the finite-particle system.
Well-posedness of the stochastic particle dynamics holds.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar linearizations could extend the method to mixtures of Gaussians or other measure classes.
The stochastic consensus mechanism may help with multimodal posteriors common in Bayesian settings.
Direct comparisons to other particle-based variational inference methods would clarify relative strengths.

Load-bearing premise

The linearized Bures-Wasserstein parametrization preserves enough of the original geometry for the consensus mechanism to reach global minima, and the mean-field limit accurately describes the long-time behavior of the finite-particle system.

What would settle it

Numerical runs on the same low-dimensional non-log-concave targets where the particle system fails to converge to the reported global minima or performs no better than gradient methods would falsify the performance advantage.

Figures

Figures reproduced from arXiv: 2601.00632 by Giacomo Borghi, José A. Carrillo.

**Figure 1.** Evolution of N = 10 Gaussian particles for minimization of E = KL divergence from target bi-modal measure (contour lines). Particles evolve according to the CBO-type dynamics we design in this paper for problems of type (1.2) (see Section 4 for the definition). Final plot compares the solution computed by the CBO algorithm with the one of BW Gradient Flow algorithm [36]. Corresponding KL values are also sh…

**Figure 2.** Visual comparison between the geometry of BW and its linearization LBW. Different …

**Figure 3.** A Brownian path in LBW space. If $B^T_t$ is a Brownian particle in $\mathrm{Sym}_d$, it may leave the cone of optimal transport maps, see (a) (equivalently, $I + B^T_t$ leaves the cone of positive semi-definite matrices $\mathrm{Sym}^+_d$). The extended exponential map (2.4), though, automatically reflects the dynamics so that $\Sigma_t = \exp_{\Sigma_0}(B^T_t) \in \mathrm{Sym}^+_d$ without additional computational effort, see (b). To visualize a symmetric matri…
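The reflection effect described in this caption can be illustrated with a short sketch. It assumes the extended exponential map has the standard Bures-Wasserstein form $\exp_\Sigma(S) = (I + S)\,\Sigma\,(I + S)$ for a symmetric matrix $S$. Equation (2.4) is not reproduced on this page, so that form, and the names `bw_exp`, `S_far`, and `S_near`, are assumptions rather than the paper's definitions.

```python
import numpy as np

def bw_exp(Sigma, S):
    """Assumed BW exponential map: Sigma -> (I + S) Sigma (I + S), with S symmetric."""
    A = np.eye(Sigma.shape[0]) + S
    out = A @ Sigma @ A.T
    # Symmetrize to remove round-off asymmetry.
    return 0.5 * (out + out.T)

Sigma0 = np.eye(2)

# I + S_far = diag(-2, 1) is indefinite, so S_far lies outside the cone
# of optimal transport maps.
S_far = np.diag([-3.0, 0.0])
# I + S_near = diag(2, 1) is positive definite and lies inside the cone.
S_near = np.diag([1.0, 0.0])

Sigma_far = bw_exp(Sigma0, S_far)
Sigma_near = bw_exp(Sigma0, S_near)
print("smallest eigenvalue:", np.linalg.eigvalsh(Sigma_far).min())
print("same covariance as the reflected point:", np.allclose(Sigma_far, Sigma_near))
```

Under this assumed form the output is a congruence of a positive semi-definite matrix, so it stays positive semi-definite for every symmetric input and no projection step is needed.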

**Figure 4.** Comparison between one run of the CBO and GF algorithms in approximating a …

**Figure 5.** Comparison between CBO and GF algorithms in approximating different target …

**Figure 6.** Sensitivity analysis of CBO performance with respect to (A) the diffusion parameter …

**Figure 7.** Comparison between CBO and GF in approximating Gaussian mixture targets ( …

Original abstract

Motivated by variational inference methods, we propose a zeroth-order algorithm for solving optimization problems in the space of Gaussian probability measures. The algorithm is based on an interacting system of Gaussian particles that stochastically explore the search space and self-organize around global minima via a consensus-based optimization (CBO) mechanism. Its construction relies on the Linearized Bures-Wasserstein (LBW) space, a novel parametrization of Gaussian measures we introduce for efficient computations. LBW is inspired by linearized optimal transport and preserves key geometric features while enabling computational tractability. We establish well-posedness and study the convergence properties of the particle dynamics via a mean-field approximation. Numerical experiments on variational inference tasks demonstrate the algorithm's robustness and superior performance with respect to deterministic gradient-based method in presence of low-dimensional non log-concave targets.
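To show what a zeroth-order objective can look like in this setting, the sketch below estimates the KL divergence from a Gaussian to a target by Monte Carlo, using only evaluations of the target log-density. Whether the paper evaluates its objective this way is not stated on this page. The bimodal target `log_bimodal`, the sample size, and all function names are hypothetical choices for illustration.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) at the rows of x (shape n x d)."""
    d = mean.shape[0]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (x - mean).T)  # whitened residuals, shape d x n
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * ((z ** 2).sum(axis=0) + log_det + d * np.log(2.0 * np.pi))

def kl_estimate(mean, cov, log_target, n_samples, rng):
    """Monte Carlo estimate of KL(N(mean, cov) || target)."""
    L = np.linalg.cholesky(cov)
    x = mean + rng.standard_normal((n_samples, mean.shape[0])) @ L.T
    # Only evaluations of log_target are needed, never its gradient.
    return float(np.mean(gaussian_logpdf(x, mean, cov) - log_target(x)))

def log_bimodal(x):
    """Hypothetical target: equal mixture of N((-2, 0), I) and N((2, 0), I)."""
    a = gaussian_logpdf(x, np.array([-2.0, 0.0]), np.eye(2))
    b = gaussian_logpdf(x, np.array([2.0, 0.0]), np.eye(2))
    return np.logaddexp(a, b) - np.log(2.0)

rng = np.random.default_rng(0)
kl = kl_estimate(np.array([2.0, 0.0]), np.eye(2), log_bimodal, 10_000, rng)
print(f"estimated KL for a Gaussian sitting on one mode: {kl:.3f}")
```

A Gaussian placed on a single mode of a two-mode target pays a KL cost of at most log 2 in this example, which is the kind of local solution a global search is meant to compare against alternatives.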

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper introduces a linearized Bures-Wasserstein parametrization for Gaussians and runs consensus-based particle dynamics on it for variational inference, with some evidence of robustness on low-dimensional non-log-concave targets.

The key takeaway is that this paper gives a workable particle method for optimizing over Gaussians using a new linearized Bures-Wasserstein space, and the low-dimensional experiments suggest it handles non-log-concave targets better than gradient descent. They introduce the LBW parametrization to make computations tractable while keeping enough of the geometry for the consensus forces to work. The algorithm lets Gaussian particles interact stochastically and self-organize, and they prove well-posedness plus convergence in the mean-field limit. That combination of a new modeling choice with existing CBO ideas is what feels fresh.

The numerics are the part that shows promise. On variational inference tasks with low-dimensional multimodal targets, the method appears more robust than deterministic alternatives. This aligns with the motivation from VI, where full gradients can be expensive or unavailable.

The main concern is how much the linearization affects the results. Since LBW is an approximation around a reference, it might lose some convexity properties in non-convex settings, and the paper's mean-field analysis may not fully bridge to the finite-particle system at long times. The abstract does not provide quantitative metrics or scaling tests, so the superiority claim rests on setups that need verification. These are not deal-breakers but they limit how far one can trust the claims without the full details.

Overall this is for researchers in optimal transport and variational methods who want zeroth-order options. It has enough technical substance to merit peer review, even if revisions will be needed on the error analysis and experiments. I would recommend sending it out.

Referee Report

3 major / 2 minor

Summary. The paper proposes a zeroth-order algorithm for optimization over Gaussian probability measures using an interacting system of Gaussian particles that self-organize via consensus-based optimization (CBO) in a novel Linearized Bures-Wasserstein (LBW) parametrization. Motivated by variational inference, the construction enables tractable computations while preserving key geometric features of the Bures-Wasserstein space. Well-posedness of the dynamics is established and convergence is studied via a mean-field limit approximation. Numerical experiments on low-dimensional non-log-concave targets claim robustness and superiority over deterministic gradient-based methods.

Significance. If the mean-field convergence analysis can be strengthened with explicit error controls, the work would provide a novel bridge between consensus-based optimization and linearized optimal transport geometries, offering a scalable particle method for non-convex variational inference problems where standard gradient approaches fail. The LBW space is a useful modeling choice that could influence future particle-based algorithms in probability measure spaces.

major comments (3)

[Convergence analysis (mean-field limit)] The convergence analysis relies on the mean-field limit to justify long-time behavior of the finite-particle system, but provides no uniform-in-time error bounds or quantitative controls on the distance between the N-particle empirical measure and the mean-field PDE, especially for non-log-concave targets where the linearization may introduce spurious equilibria.
[Numerical experiments] The superiority claim rests on numerical experiments for low-dimensional non-log-concave targets, yet the manuscript supplies no quantitative metrics (e.g., KL divergence values, convergence rates), experimental setup details (initializations, specific target distributions, dimension values), or ablation studies, preventing verification of the robustness assertion.
[LBW parametrization definition and properties] The LBW parametrization is asserted to preserve key geometric features (including those relevant to consensus forces) while remaining tractable, but no explicit verification or counterexample analysis is given showing that geodesic convexity properties survive the linearization around a reference measure when the target is multimodal.

minor comments (2)

[Abstract] The abstract contains the unhyphenated phrase 'non log-concave'; standardize to 'non-log-concave' for consistency with mathematical literature.
[Notation and preliminaries] Notation for the LBW space, particle interactions, and mean-field limit should be introduced with a dedicated table or glossary to aid readability across sections.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We provide point-by-point responses below and will make revisions to address the concerns where possible.

Referee: [Convergence analysis (mean-field limit)] The convergence analysis relies on the mean-field limit to justify long-time behavior of the finite-particle system, but provides no uniform-in-time error bounds or quantitative controls on the distance between the N-particle empirical measure and the mean-field PDE, especially for non-log-concave targets where the linearization may introduce spurious equilibria.

Authors: We appreciate this observation. The manuscript establishes well-posedness of the finite-particle dynamics and analyzes the mean-field limit to characterize the long-time behavior. However, we do not provide explicit uniform-in-time error bounds or quantitative controls on the approximation error for the empirical measure, particularly in the non-log-concave setting. In the revised version, we will include a dedicated remark discussing this limitation and its implications for the analysis, along with suggestions for future work on quantitative convergence rates.
Revision: yes

Referee: [Numerical experiments] The superiority claim rests on numerical experiments for low-dimensional non-log-concave targets, yet the manuscript supplies no quantitative metrics (e.g., KL divergence values, convergence rates), experimental setup details (initializations, specific target distributions, dimension values), or ablation studies, preventing verification of the robustness assertion.

Authors: We agree that additional details are necessary to substantiate the numerical claims. The revised manuscript will expand the numerical experiments section to include quantitative metrics such as KL divergence values and convergence rates, detailed experimental setups specifying initializations, target distributions, and dimension values, as well as ablation studies to demonstrate robustness.
Revision: yes

Referee: [LBW parametrization definition and properties] The LBW parametrization is asserted to preserve key geometric features (including those relevant to consensus forces) while remaining tractable, but no explicit verification or counterexample analysis is given showing that geodesic convexity properties survive the linearization around a reference measure when the target is multimodal.

Authors: The LBW parametrization is constructed via linearization in the Bures-Wasserstein geometry to retain essential properties for the consensus-based optimization, such as the structure of the consensus forces. To strengthen this, the revised manuscript will provide explicit verification of the preserved geometric features, including an analysis of how the linearization affects geodesic convexity for multimodal targets, supported by relevant derivations or illustrative examples.
Revision: yes

Circularity Check

0 steps flagged

No circularity: LBW parametrization and mean-field analysis are independent modeling choices built on external CBO and OT ideas

The derivation introduces the Linearized Bures-Wasserstein parametrization as an explicit modeling choice inspired by linearized optimal transport, not obtained by fitting or redefinition from the target variational inference result. Well-posedness of the particle system and convergence properties are established through standard mean-field limit arguments applied to the CBO dynamics, which are not equivalent by construction to the finite-particle numerics or the claimed robustness on non-log-concave targets. Numerical experiments compare against gradient methods on separate low-dimensional tasks without reusing fitted parameters as predictions. No self-definitional reductions, fitted-input predictions, or load-bearing self-citations appear in the chain; the central claims rest on external consensus-based optimization literature and empirical validation rather than internal re-labeling.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the novel LBW parametrization being a faithful yet tractable proxy for Bures-Wasserstein geometry and on the validity of the mean-field limit for the interacting particle system.

axioms (1)

Domain assumption: The Linearized Bures-Wasserstein space preserves key geometric features of the Bures-Wasserstein geometry while enabling computational tractability.
Invoked in the abstract as the foundation for the algorithm construction.

invented entities (1)

Linearized Bures-Wasserstein (LBW) space (no independent evidence)
Purpose: A novel parametrization of Gaussian measures that allows efficient computations while retaining essential geometric structure.
Introduced by the authors as the core modeling device.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: We propose the Linearized Bures–Wasserstein space (LBW)... consensus point... weighted LBW barycenters... mean-field approximation

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: Numerical experiments... superior performance... non log-concave targets

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

63 extracted references · 63 canonical work pages

[1]

SIAM Journal on Mathematical Analysis , author =

M. Agueh and G. Carlier. Barycenters in the Wasserstein space.SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011. arXiv:https://doi.org/10.1137/ 100805741,doi:10.1137/100805741

work page doi:10.1137/100805741 2011
[2]

Alvarez-Melis, Y

D. Alvarez-Melis, Y. Schiff, and Y. Mroueh. Optimizing functionals on the space of probabilities with input convex neural networks.Transactions on Machine Learning Research, 2022. URL:https://openreview.net/forum?id=dpOYN7o8Jm

work page 2022
[3]

Ambrosio, N

L. Ambrosio, N. Gigli, and G. Savar´ e.Gradient Flows in Metric Spaces and in the Space of Probability Measures. Lectures in Mathematics ETH Z¨ urich. Birkh¨ auser, 2. ed edition, 2008. OCLC: 254181287

work page 2008
[4]

Arasaratnam and S

I. Arasaratnam and S. Haykin. Cubature Kalman filters.IEEE Transactions on Automatic Control, 54(6):1254–1269, 2009.doi:10.1109/TAC.2009.2019800

work page doi:10.1109/tac.2009.2019800 2009
[5]

Bhatia, T

R. Bhatia, T. Jain, and Y. Lim. On the Bures–Wasserstein distance between positive definite matrices.Expositiones Mathematicae, 37(2):165– 191, 2019. URL: https://www.sciencedirect.com/science/article/pii/ S0723086918300021,doi:10.1016/j.exmath.2018.01.002

work page doi:10.1016/j.exmath.2018.01.002 2019
[6]

A. N. Bishop and A. Doucet. Distributed nonlinear consensus in the space of probability measures.IFAC Proceedings Volumes, 47(3):8662–8668, 2014. 19th IFAC World Congress. URL: https://www.sciencedirect.com/science/article/pii/ S1474667016429800,doi:10.3182/20140824-6-ZA-1003.00341

work page doi:10.3182/20140824-6-za-1003.00341 2014
[7]

A. N. Bishop and A. Doucet. Network consensus in the Wasserstein metric space of probability measures.SIAM Journal on Control and Optimization, 59(5):3261–3277, 2021.arXiv:https://doi.org/10.1137/19M1268252,doi:10.1137/19M1268252

work page doi:10.1137/19m1268252 2021
[8]

D. M. Blei, A. Kucukelbir, and J. D. McAuliffe. Variational inference: A review for statisticians.Journal of the American Statistical Association, 112(518):859–877,

work page
[9]

arXiv:https://doi.org/10.1080/01621459.2017.1285773, doi:10.1080/ 01621459.2017.1285773

work page doi:10.1080/01621459.2017.1285773 2017
[10]

Borghi and J

G. Borghi and J. Carrillo. Variational inference via Gaussian interacting particles in the Bures-Wasserstein geometry, 2025. Media.doi:10.6084/m9.figshare.30958322

work page doi:10.6084/m9.figshare.30958322 2025
[11]

Borghi, M

G. Borghi, M. Herty, and L. Pareschi. Constrained consensus-based optimization. SIAM Journal on Optimization, 33(1):211–236, 2023. arXiv:https://doi.org/10. 1137/22M1471304,doi:10.1137/22M1471304

work page doi:10.1137/22m1471304 2023
[12]

Borghi, M

G. Borghi, M. Herty, and L. Pareschi. Kinetic models for optimization: A unified mathematical framework for metaheuristics.arXiv preprint arXiv:2410.10369, 2024. URL:https://arxiv.org/abs/2410.10369,arXiv:2410.10369

work page arXiv 2024
[13]

Borghi, M

G. Borghi, M. Herty, and A. Stavitskiy. Dynamics of measure-valued agents in the space of probabilities.SIAM Journal on Mathematical Analysis, 57(5):5107–5134, 2025.arXiv:https://doi.org/10.1137/24M1675515,doi:10.1137/24M1675515. 23

work page doi:10.1137/24m1675515 2025
[14]

Y. Brenier. Polar factorization and monotone rearrangement of vector-valued func- tions.Communications on Pure and Applied Mathematics, 44(4):375–417, 1991. URL: https://onlinelibrary.wiley.com/doi/abs/10.1002/cpa.3160440402, arXiv: https://onlinelibrary.wiley.com/doi/pdf/10.1002/cpa.3160440402, doi:10. 1002/cpa.3160440402

work page doi:10.1002/cpa.3160440402 1991
[15]

T. Cai, J. Cheng, N. Craig, and K. Craig. Linearized optimal transport for collider events.Phys. Rev. D, 102:116019, Dec 2020. URL: https://link.aps.org/doi/10. 1103/PhysRevD.102.116019,doi:10.1103/PhysRevD.102.116019

work page doi:10.1103/physrevd.102.116019 2020
[16]

Carlier, A

G. Carlier, A. Delalande, and Q. M´ erigot. Quantitative stability of the pushforward operation by an optimal transport map.Foundations of Computational Mathematics, 2024.doi:10.1007/s10208-024-09669-4

work page doi:10.1007/s10208-024-09669-4 2024
[17]

J. A. Carrillo, Y.-P. Choi, C. Totzeck, and O. Tse. An analytical framework for consensus-based global optimization method.Math. Models Methods Appl. Sci., 28(6):1037–1066, 2018.doi:10.1142/S0218202518500276

work page doi:10.1142/s0218202518500276 2018
[18]

J. A. Carrillo, S. Jin, L. Li, and Y. Zhu. A consensus-based global optimization method for high dimensional machine learning problems.ESAIM: Control, Optimisation and Calculus of Variations, 27:S5, 2021

work page 2021
[19]

Chewi, T

S. Chewi, T. Maunu, P. Rigollet, and A. Stromme. Gradient descent algorithms for Bures-Wasserstein barycenters. In J. Abernethy and S. Agarwal, editors,Pro- ceedings of Thirty Third Conference on Learning Theory, volume 125 ofProceedings of Machine Learning Research, pages 1276–1304. PMLR, 09–12 Jul 2020. URL: https://proceedings.mlr.press/v125/chewi20a.html

work page 2020
[20]

Cisneros-Velarde and F

P. Cisneros-Velarde and F. Bullo. Distributed Wasserstein barycenters via displace- ment interpolation.IEEE Transactions on Control of Network Systems, 10(2):785–795, 2023.doi:10.1109/TCNS.2022.3210341

work page doi:10.1109/tcns.2022.3210341 2023
[21]

Delalande and Q

A. Delalande and Q. Merigot. Quantitative stability of optimal transport maps under variations of the target measure.Duke Mathematical Journal, 172(17):3321–3357, 2023

work page 2023
[22]

Dembo and O

A. Dembo and O. Zeitouni.Large Deviations Techniques and Applications. Springer- Verlag Berlin Heidelberg, 2010

work page 2010
[23]

M. Z. Diao, K. Balasubramanian, S. Chewi, and A. Salim. Forward-backward Gaussian variational inference via JKO in the Bures-Wasserstein space. In A. Krause, E. Brunskill, K. Cho, B. Engelhardt, S. Sabato, and J. Scarlett, editors,Proceedings of the 40th International Conference on Machine Learning, volume 202 ofProceedings of Machine Learning Research, p...

work page 2023
[24]

Journal of Multivariate Analysis , author =

D. Dowson and B. Landau. The Fr´ echet distance between multivari- ate normal distributions.Journal of Multivariate Analysis, 12(3):450– 455, 1982. URL: https://www.sciencedirect.com/science/article/pii/ 0047259X8290077X,doi:10.1016/0047-259X(82)90077-X

work page doi:10.1016/0047-259x(82)90077-x 1982
[25]

Fornasier, T

M. Fornasier, T. Klock, and K. Riedl. Consensus-based optimization methods converge globally.SIAM Journal on Optimization, 34(3):2973–3004, 2024. arXiv:https: //doi.org/10.1137/22M1527805,doi:10.1137/22M1527805

work page doi:10.1137/22m1527805 2024
[26]

Gerber, F

N. Gerber, F. Hoffmann, D. Kim, and U. Vaes. Uniform-in-time propagation of chaos for consensus-based optimization.arXiv preprint arXiv:2505.08669, 2025

work page arXiv 2025
[27]

M. B. Giles. Collected matrix derivative results for forward and reverse mode algorithmic differentiation. In C. H. Bischof, H. M. B¨ ucker, P. Hovland, U. Naumann, 24 and J. Utke, editors,Advances in Automatic Differentiation, pages 35–44, Berlin, Heidelberg, 2008. Springer Berlin Heidelberg

work page 2008
[28]

C. R. Givens and R. M. Shortt. A class of Wasserstein metrics for probability distributions.Michigan Mathematical Journal, 31(2):231–240, 1984

work page 1984
[29]

A. Han, B. Mishra, P. Jawanpuria, and J. Gao. On Riemannian optimization over positive definite matrices with the Bures–Wasserstein geometry. In A. Beygelzimer, Y. Dauphin, P. Liang, and J. W. Vaughan, editors,Advances in Neural Information Processing Systems, 2021. URL: https://openreview.net/forum?id=ZCHxGFmc62a

work page 2021
[30]

Holland.Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence

J. Holland.Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence. University of Michi- gan Press, 1975. URL:https://books.google.co.uk/books?id=JE5RAAAAMAAJ

work page 1975
[31]

Katsevich and P

A. Katsevich and P. Rigollet. On the approximation accuracy of Gaussian variational inference.The Annals of Statistics, 52(4):1384–1409, 2024. doi:10.1214/24-AOS2393

work page doi:10.1214/24-aos2393 2024
[32]

Kennedy and R

J. Kennedy and R. Eberhart. Particle swarm optimization. InProceedings of ICNN’95- international conference on neural networks, volume 4, pages 1942–1948. ieee, 1995

work page 1942
[33]

M. E. Khan and H. Rue. The Bayesian learning rule.J. Mach. Learn. Res., 24(1), Jan. 2023

work page 2023
[34]

doi:10.1126/science.220.4598.671 Jeffrey C

S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi. Optimization by simulated anneal- ing.Science, 220(4598):671–680, 1983. URL: https://www.science.org/doi/abs/ 10.1126/science.220.4598.671, arXiv:https://www.science.org/doi/pdf/10. 1126/science.220.4598.671,doi:10.1126/science.220.4598.671

work page doi:10.1126/science.220.4598.671 1983
[35]

Knoblauch, J

J. Knoblauch, J. Jewson, and T. Damoulas. An optimization-centric view on Bayes’ rule: Reviewing and generalizing variational inference.Journal of Machine Learning Research, 23(132):1–109, 2022. URL: http://jmlr.org/papers/v23/19-1047.html

work page 2022
[36]

Kolouri, A

S. Kolouri, A. B. Tosun, J. A. Ozolek, and G. K. Rohde. A continuous linear optimal transport approach for pattern analysis in image datasets.Pattern recognition, 51:453– 462, 2016

work page 2016
[37]

Lambert, S

M. Lambert, S. Chewi, F. Bach, S. Bonnabel, and P. Rigollet. Variational inference via Wasserstein gradient flows. In A. H. Oh, A. Agarwal, D. Belgrave, and K. Cho, editors,Advances in Neural Information Processing Systems, 2022. URL: https: //openreview.net/forum?id=K2PTuvVTF1L

work page 2022
[38]

Liero, A

M. Liero, A. Mielke, O. Tse, and J.-J. Zhu. Evolution of Gaussians in the Hellinger-Kantorovich-Boltzmann gradient flow.Communications on Pure and Ap- plied Analysis, (early access), 2025. URL: https://www.aimsciences.org/article/ id/68ecc16acb5dde21e7544b1b,doi:10.3934/cpaa.2025105

work page doi:10.3934/cpaa.2025105 2025
[39]

J. Lott. Some geometric calculations on Wasserstein space.Communications in Mathematical Physics, 277(2):423–437, 2008.doi:10.1007/s00220-007-0367-3

work page doi:10.1007/s00220-007-0367-3 2008
[40]

Malag` o, L

L. Malag` o, L. Montrucchio, and G. Pistone. Wasserstein Riemannian geometry of Gaussian densities.Information Geometry, 1(2):137–179, Dec 2018. doi:10.1007/ s41884-018-0014-4

work page 2018
[41]

Moosm¨ uller and A

C. Moosm¨ uller and A. Cloninger. Linear optimal transport embedding: provable Wasserstein classification for certain rigid transformations and perturbations.Infor- mation and Inference: A Journal of the IMA, 12(1):363–389, 2023

work page 2023
[42]

Olkin and F

I. Olkin and F. Pukelsheim. The distance between two random vec- tors with given dispersion matrices.Linear Algebra and its Applications, 48:257–263, 1982. URL: https://www.sciencedirect.com/science/article/pii/ 0024379582901124,doi:10.1016/0024-3795(82)90112-4. 25

work page doi:10.1016/0024-3795(82)90112-4 1982
[43]

F. Otto. The geometry of dissipative evolution equations: The porous medium equation.Communications in Partial Differential Equations, 26(1-2):101–174, 2001. doi:10.1081/PDE-100002243

work page doi:10.1081/pde-100002243 2001
[44]

Generalization of an Inequality by Talagrand and Links with the Logarithmic Sobolev Inequality

F. Otto and C. Villani. Generalization of an inequality by Talagrand and links with the logarithmic Sobolev inequality.Journal of Functional Analysis, 173(2):361–400, 2000. URL: https://www.sciencedirect.com/science/article/ pii/S0022123699935577,doi:10.1006/jfan.1999.3557

work page doi:10.1006/jfan.1999.3557 2000
[45]

Pennec, P

X. Pennec, P. Fillard, and N. Ayache. A Riemannian framework for tensor computing.International Journal of Computer Vision, 66(1):41–66, 2006. doi: 10.1007/s11263-005-3222-z

work page doi:10.1007/s11263-005-3222-z 2006
[46]

Pettersson

R. Pettersson. Projection scheme for stochastic differential equations with convex constraints.Stochastic Processes and their Applications, 88(1):125– 134, 2000. URL: https://www.sciencedirect.com/science/article/pii/ S0304414999001210,doi:10.1016/S0304-4149(99)00121-0

work page doi:10.1016/s0304-4149(99)00121-0 2000
[47]

Pilipenko.An introduction to stochastic differential equations with reflection

A. Pilipenko.An introduction to stochastic differential equations with reflection. Universit¨ atsverlag Potsdam, 2014

work page 2014
[48]

Pinnau and C

R. Pinnau, C. Totzeck, O. Tse, and S. Martin. A consensus-based model for global optimization and its mean-field limit.Math. Models Methods Appl. Sci., 27(1):183–204, 2017.doi:10.1142/S0218202517400061

work page doi:10.1142/s0218202517400061 2017
[49]

Polyanskiy and Y

Y. Polyanskiy and Y. Wu. Wasserstein continuity of entropy and outer bounds for interference channels.IEEE Transactions on Information Theory, 62(7):3992–4002, 2016

work page 2016
[50]

Ren and F.-Y

P. Ren and F.-Y. Wang. Ornstein–Uhlenbeck type processes on Wasserstein spaces. Stochastic Processes and their Applications, 172:104339, 2024. doi:10.1016/j.spa. 2024.104339

work page doi:10.1016/j.spa 2024
[51]

R¨ uschendorf and L

L. R¨ uschendorf and L. Uckelmann. On the n-coupling problem.Journal of Multivariate Analysis, 81(2):242–258, 2002. URL: https://www.sciencedirect.com/science/ article/pii/S0047259X01920056,doi:10.1006/jmva.2001.2005

work page doi:10.1006/jmva.2001.2005 2002
[52]

Santambrogio.Optimal Transport for Applied Mathematicians

F. Santambrogio.Optimal Transport for Applied Mathematicians. Birkh¨ auser, 2015

work page 2015
[53]

Sarrazin and B

C. Sarrazin and B. Schmitzer. Linearized optimal transport on manifolds.SIAM Journal on Mathematical Analysis, 56(4):4970–5016, 2024. arXiv:https://doi.org/ 10.1137/23M1564535,doi:10.1137/23M1564535

work page doi:10.1137/23m1564535 2024
[54]

SIAM Rev.58(3), 377–441 (2016)

V. Simoncini. Computational methods for linear matrix equations.SIAM Review, 58(3):377–441, 2016. arXiv:https://doi.org/10.1137/130912839, doi:10.1137/ 130912839

[55] V. Spokoiny and M. Panov. Accuracy of Gaussian approximation for high-dimensional posterior distributions. Bernoulli, 31(2):843–867, 2025. doi:10.3150/21-BEJ1412

[56] A.-S. Sznitman. Topics in propagation of chaos. In École d’été de probabilités de Saint-Flour XIX–1989, pages 165–251. Springer, 1991

[57] A. Takatsu. Wasserstein geometry of Gaussian measures. Osaka Journal of Mathematics, 48(4):1005–1026, 2011

[58] Y. Thanwerdas and X. Pennec. Bures–Wasserstein minimizing geodesics between covariance matrices of different ranks. SIAM Journal on Matrix Analysis and Applications, 44(3):1447–1476, 2023. doi:10.1137/22M149168X

[59] L. Tierney and J. B. Kadane. Accurate approximations for posterior moments and marginal densities. Journal of the American Statistical Association, 81(393):82–86, 1986. URL: https://www.tandfonline.com/doi/abs/10.1080/01621459.1986.10478240, doi:10.1080/01621459.1986.10478240

[60] A. Uhlmann. The metric of Bures and the geometric phase. Quantum groups and related topics, pages 267–274, 1992

[61] T. Vayer and R. Gribonval. Controlling Wasserstein distances by kernel norms with application to compressive statistical learning. Journal of Machine Learning Research, 24(149):1–51, 2023. URL: http://jmlr.org/papers/v24/21-1516.html

[62] Y. Zemel and V. M. Panaretos. Fréchet means and Procrustes analysis in Wasserstein space. Bernoulli, 25(2):932–976, 2019. doi:10.3150/17-BEJ1009

[63] P. C. Álvarez Esteban, E. del Barrio, J. Cuesta-Albertos, and C. Matrán. A fixed-point approach to barycenters in Wasserstein space. Journal of Mathematical Analysis and Applications, 441(2):744–762, 2016. URL: https://www.sciencedirect.com/science/article/pii/S0022247X16300907, doi:10.1016/j.jmaa.2016.04.045


