
arXiv: 2605.08424 · v1 · submitted 2026-05-08 · cs.LG · math.OC · math.PR

Generalized Wasserstein Flow Matching: Transport Plans, Everywhere, All at Once

Pith reviewed 2026-05-12 00:48 UTC · model grok-4.3

classification: cs.LG, math.OC, math.PR
keywords: flow matching, Wasserstein distance, generative modeling, transport plans, metameasures, point cloud generation, set generation

The pith

Wasserstein flow matching extends to measures over measures using coupled outer and inner transport plans.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends flow matching from single probability measures to the space of measures over probability measures. It introduces a Wasserstein-on-Wasserstein formulation where measures over transport plans induce velocity fields that drive flows between these higher-level objects. A sympathetic reader would care because this creates a unified way to generate complex structured data such as point clouds or collections of sets by learning deterministic transport at two nested levels rather than relying on separate models for each layer.

Core claim

Leveraging the nested Wasserstein geometry, measures over transport plans naturally induce velocity fields that realize metameasure flows. This yields a principled generalization of Wasserstein flow matching via coupled outer and inner transport plans. Scalable approximations based on sliced and linear Wasserstein distances enable efficient training while promoting numerically stable, near-straight trajectories.

What carries the argument

The Wasserstein-on-Wasserstein (WoW) formulation that couples outer transport plans between metameasures with inner transport plans between measures.
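
For orientation, the nested structure can be written as two stacked optimal transport problems. The block below is a sketch in standard notation; the symbols (bold letters for metameasures, $\Gamma$ for inner couplings, $C$ for outer couplings, $\mathrm{OP}$ for an outer plan) are assumptions chosen here for readability and may differ from the paper's own notation.

```latex
% Inner level: squared 2-Wasserstein distance between measures \mu, \nu on R^d,
% minimized over inner transport plans \pi with marginals \mu and \nu.
W_2^2(\mu, \nu)
  = \min_{\pi \in \Gamma(\mu, \nu)}
    \int_{\mathbb{R}^d \times \mathbb{R}^d} \lVert x - y \rVert^2 \, \mathrm{d}\pi(x, y)

% Outer level: the same construction one level up, with W_2^2 as ground cost,
% minimized over outer transport plans OP between the metameasures.
\mathbb{W}_2^2(\boldsymbol{\mu}, \boldsymbol{\nu})
  = \min_{\mathrm{OP} \in C(\boldsymbol{\mu}, \boldsymbol{\nu})}
    \int_{\mathcal{P}_2(\mathbb{R}^d) \times \mathcal{P}_2(\mathbb{R}^d)}
      W_2^2(\mu, \nu) \, \mathrm{d}\mathrm{OP}(\mu, \nu)
```

The outer plan decides which measure is paired with which, and the inner plan decides how mass moves within each pair.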

If this is right

  • Point cloud and set generation methods become special cases of a single coupled-plan framework.
  • Training remains efficient for high-dimensional data where exact Wasserstein-on-Wasserstein computation is intractable.
  • The resulting flows produce deterministic transport dynamics between metameasures rather than stochastic alternatives.
  • Existing flow matching techniques extend directly to hierarchical data without architectural changes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same nesting idea could apply to other optimal transport geometries beyond Wasserstein for modeling multi-level distributions.
  • Practitioners working on conditional generation of sets might replace separate encoders with this single coupled-plan objective.
  • The near-straight trajectories suggest the method could serve as a drop-in replacement for diffusion models on structured data once scaled.

Load-bearing premise

That approximations based on sliced and linear Wasserstein distances preserve the theoretical properties of the metameasure flows while delivering numerically stable near-straight trajectories.

What would settle it

An experiment in which the learned trajectories under the sliced or linear approximations deviate substantially from straight lines or produce generated metameasures whose Wasserstein distance to the target exceeds that of standard flow matching baselines.
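
One way to make "deviate substantially from straight lines" measurable is a per-point straightness ratio: path length divided by endpoint displacement. The sketch below is an illustration only. The metric and the function name `straightness_ratio` are assumptions introduced here, not a quantity the paper is known to report.

```python
import numpy as np

def straightness_ratio(trajectory):
    """Path length divided by endpoint displacement, per point.

    trajectory: array of shape (T, N, d) holding T time steps of N points in R^d.
    Returns an array of shape (N,); a value of 1.0 means an exactly straight path.
    """
    traj = np.asarray(trajectory, dtype=float)
    steps = np.diff(traj, axis=0)                            # (T-1, N, d)
    path_len = np.linalg.norm(steps, axis=-1).sum(axis=0)    # (N,)
    chord = np.linalg.norm(traj[-1] - traj[0], axis=-1)      # (N,)
    return path_len / np.maximum(chord, 1e-12)

# Straight-line paths from the origin to (1, 1) for three points.
t = np.linspace(0.0, 1.0, 11)[:, None, None]
straight = (1.0 - t) * np.zeros((1, 3, 2)) + t * np.ones((1, 3, 2))

# A curved path: one point moving along a quarter circle.
theta = np.linspace(0.0, np.pi / 2, 200)
arc = np.stack([np.cos(theta), np.sin(theta)], axis=-1)[:, None, :]

ratio_straight = straightness_ratio(straight)
ratio_arc = straightness_ratio(arc)
```

Ratios close to 1 across sampled trajectories would support the near-straight claim; ratios well above 1 under the sliced or linear approximations would count against it.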

Figures

Figures reproduced from arXiv: 2605.08424 by Gabriele Steidl, Moritz Piening, Richard Duong.

Figure 1. Learned trajectories for moving horizontally aligned source circles (bottom) to larger …
Figure 2. Generated ShapeNet planes based on ind, SW and LLW outer couplings. The first setting (2a) corresponds to the setting of Wasserstein (or S_N-equivariant) flow matching [20, 29]. All predictions are based on the same fixed samples from our barycentric source. While all models produce similar outputs, zooming in on Subfigure 2a shows more noise artifacts at N = 2048. (σ = 0.15, σ = 0.45) and train each model…
Figure 3. Figure 3 visualizes learned interpolations via kernel density estimation [46], see Appendix C.8. While (OPdW,IPcW) and (OPdSW,IPcSW) preserve digit structure at the midpoint (t = 0.5), the independent coupling (OPdind,IPcind) yields amorphous intermediate densities. Panels: (a) N = 128, (b) N = 4096.
Figure 4. Learned paths for (OPdW, IPcW) and B = 4, 8, 16. Larger batches lead to straighter paths.
    Adjacent text captured with the caption: B.3 Ablation of Sliced Wasserstein Solver. The quality of any sliced Wasserstein approximation is directly linked to the number of random projections ♯ Slices in (18). In order to study its impact, we repeat the experiment from Section 4.2, but now with our sliced transport plan estimators (OPdSW, IPcSW), and we vary ♯…
Figure 5. Learned paths for (OPdSW, IPcSW) and ♯ Slices = 2, 8, 32. More projections lead to straighter paths.
    Adjacent text captured with the caption: B.4 Ablation of Wasserstein Solver. Notably, we mainly relied on exact linear program solvers for our Wasserstein OT estimators due to the availability of highly efficient implementations [16]. Especially for large-scale problems, such exact solvers are often replaced by Sinkhorn solvers that solve an entropi…
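
To illustrate what the number of slices controls, here is a minimal Monte Carlo estimator of the squared sliced 2-Wasserstein distance between two equal-size point clouds. It is a generic textbook sketch, not the paper's estimator: the function name, the `n_slices` parameter (standing in for ♯ Slices), and the uniform-weight, equal-size assumption are all introduced here.

```python
import numpy as np

def sliced_wasserstein2_sq(X, Y, n_slices=32, seed=0):
    """Monte Carlo estimate of squared sliced W2 between two point clouds.

    X, Y: arrays of shape (N, d) with the same N, treated as uniform
    empirical measures (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Random directions on the unit sphere in R^d.
    thetas = rng.normal(size=(n_slices, d))
    thetas /= np.linalg.norm(thetas, axis=1, keepdims=True)
    # Project both clouds onto every direction: shape (N, n_slices).
    Xp = np.sort(X @ thetas.T, axis=0)
    Yp = np.sort(Y @ thetas.T, axis=0)
    # In 1D, optimal transport between equal-size samples matches sorted
    # points, so each column gives one exact 1D squared W2 value.
    return np.mean((Xp - Yp) ** 2)

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 3))
shift = np.array([1.0, 2.0, 2.0])          # squared norm 9
sw_same = sliced_wasserstein2_sq(X, X)
sw_shift = sliced_wasserstein2_sq(X, X + shift, n_slices=64)
```

Each slice costs one sort per cloud, which is why the estimator scales well; fewer slices give a noisier estimate, which is one plausible reading of why more projections lead to straighter paths in the figure.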
Figure 6. Learned paths for (OPdW, IPcW) with Sinkhorn regularizer reg = 0.01, 0.1, 1. Less regularization leads to straighter paths, but increases training time.
    Adjacent text captured with the caption: B.5 Ablation of Transport Solver Runtime. To quantify the computational overhead of different transport solver combinations, we measure the wall-clock time required by the transport solver per training step for varying cloud size N and batch size B. Concret…
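
The effect of the regularizer can be seen in a few lines. The sketch below is a plain (not log-domain) Sinkhorn iteration with uniform marginals. It is an illustration under assumptions: the function name, the cost normalization, and the iteration count are choices made here, and very small `reg` values would need a stabilized implementation.

```python
import numpy as np

def sinkhorn_plan(C, reg, n_iters=1000):
    """Entropy-regularized transport plan between uniform marginals.

    C: cost matrix of shape (n, m); reg: entropic regularization strength.
    """
    C = np.asarray(C, dtype=float)
    n, m = C.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    K = np.exp(-C / reg)              # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)               # rescale to match row marginals
        v = b / (K.T @ u)             # rescale to match column marginals
    return u[:, None] * K * v[None, :]

# Four points on a line transported onto themselves; cost scaled to [0, 1].
x = np.arange(4.0)
C = (x[:, None] - x[None, :]) ** 2
C = C / C.max()

P_sharp = sinkhorn_plan(C, reg=0.01)  # close to the exact (identity) coupling
P_blur = sinkhorn_plan(C, reg=1.0)    # mass smeared across all pairs
```

With small `reg` the plan concentrates on the exact matching; with large `reg` it approaches the independent coupling, which is consistent with the caption's observation that less regularization gives straighter paths.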
Figure 7. 2D histogram visualization based on true and generated MNIST point clouds (…
Figure 8. 2D histogram visualization of the computed MNIST barycenter employed as the center …
Figure 9. True and generated ShapeNet airplanes for discretization …
Figure 10. 2D scatter plot visualization of the point cloud flow learned using the WoW setup dc …
Figure 11. 2D histogram visualization based on true and generated USPS point clouds for dc …
Figure 12. Same visualization as in Figure 3 with more samples for …
Original abstract

Flow matching has recently emerged as a flexible and efficient framework for generative modelling by learning deterministic transport dynamics between probability measures. In this work, we extend flow matching to the space of probability measures over probability measures, introducing a Wasserstein-on-Wasserstein (WoW) formulation. Leveraging the nested Wasserstein geometry, we show that measures over transport plans naturally induce velocity fields that realize metameasure flows. This yields a principled generalization of Wasserstein flow matching via coupled outer and inner transport plans. To address the substantial computational cost of WoW transport, we propose scalable approximations based on sliced and linear Wasserstein distances, enabling efficient training while promoting numerically stable, near-straight trajectories. Our framework unifies and extends existing approaches to point cloud and set generation, providing a practical and theoretically grounded method for generative modelling in WoW spaces.
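
As a concrete picture of how a transport plan induces a velocity target, the sketch below uses the generic flow matching recipe: once an inner plan pairs source and target points, the interpolant is a straight line and the regression target is the constant displacement. This is the standard construction from the flow matching literature, not necessarily the paper's exact objective. The function name and the representation of the inner plan as a permutation (valid for equal-size, uniformly weighted point clouds) are assumptions made here.

```python
import numpy as np

def fm_training_pair(X0, X1, perm, t):
    """Straight-line interpolant and velocity target for one cloud pair.

    X0, X1: point clouds of shape (N, d).
    perm:   inner plan as a permutation, X0[i] is transported to X1[perm[i]].
    t:      time in [0, 1].
    """
    X1_matched = X1[perm]
    Xt = (1.0 - t) * X0 + t * X1_matched   # position at time t
    Vt = X1_matched - X0                   # constant velocity along the line
    return Xt, Vt

rng = np.random.default_rng(0)
X0 = rng.normal(size=(5, 2))
shift = np.array([2.0, -1.0])

# Build a target cloud that is a shuffled translate of the source.
perm_true = rng.permutation(5)
X1 = np.empty_like(X0)
X1[perm_true] = X0 + shift                 # X1[perm_true[i]] = X0[i] + shift

Xt, Vt = fm_training_pair(X0, X1, perm_true, t=0.5)
```

In the nested setting an outer plan would first choose which source cloud is paired with which target cloud, and this inner step would then run on each chosen pair.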

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper extends flow matching to the space of probability measures over probability measures by introducing a Wasserstein-on-Wasserstein (WoW) formulation. Leveraging nested Wasserstein geometry, it claims that measures over transport plans induce velocity fields realizing metameasure flows through coupled outer and inner transport plans. Scalable approximations based on sliced and linear Wasserstein distances are proposed to mitigate computational costs while aiming for stable, near-straight trajectories. The framework is positioned as a unification and extension of existing methods for point cloud and set generation.

Significance. If the central theoretical claims on metameasure flows hold and the approximations are shown to preserve key properties, the work could provide a principled extension of flow matching to higher-order measure spaces. This would offer a unified theoretical lens for generative tasks involving sets and point clouds, potentially improving stability and interpretability over ad-hoc extensions. The emphasis on coupled transport plans and nested geometry is a clear strength in grounding the generalization.

major comments (2)
  1. [§3] §3 (theoretical development): The core claim that 'measures over transport plans naturally induce velocity fields that realize metameasure flows' via nested Wasserstein geometry is load-bearing for the generalization, yet the manuscript provides no explicit derivation, regularity conditions, or proof outline showing how the outer measure induces the inner velocity field; this must be supplied with key steps to substantiate the result.
  2. [§5] §5 (approximations): The assertion that sliced and linear Wasserstein approximations preserve the theoretical properties of the metameasure flows while yielding numerically stable trajectories is central to the practical contribution, but lacks error analysis, convergence guarantees, or ablation studies demonstrating that the approximations do not distort the coupled outer/inner plans; without this, the claim that they enable efficient training without sacrificing the framework's advantages is unsupported.
minor comments (2)
  1. The abstract and introduction would benefit from a concise statement of the main theorem or proposition number that formalizes the velocity field induction, to help readers locate the key result.
  2. [§2] Notation for the outer and inner transport plans (e.g., distinction between Π and π) should be introduced with a dedicated table or equation block early in §2 to improve readability across the nested geometry sections.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and positive assessment of the potential impact of our work. We address each major comment below.

Point-by-point responses
  1. Referee: [§3] §3 (theoretical development): The core claim that 'measures over transport plans naturally induce velocity fields that realize metameasure flows' via nested Wasserstein geometry is load-bearing for the generalization, yet the manuscript provides no explicit derivation, regularity conditions, or proof outline showing how the outer measure induces the inner velocity field; this must be supplied with key steps to substantiate the result.

    Authors: We agree that an explicit derivation with regularity conditions and a proof outline is required to fully substantiate the central claim. While the manuscript states the result, the detailed steps showing how the outer measure induces the inner velocity field via the nested geometry were not expanded sufficiently in §3. In the revised manuscript we will add a dedicated proof sketch in §3, including assumptions (e.g., compactly supported measures with finite second moments and optimal inner transport plans) and the key differentiation steps under the coupled outer/inner plans. revision: yes

  2. Referee: [§5] §5 (approximations): The assertion that sliced and linear Wasserstein approximations preserve the theoretical properties of the metameasure flows while yielding numerically stable trajectories is central to the practical contribution, but lacks error analysis, convergence guarantees, or ablation studies demonstrating that the approximations do not distort the coupled outer/inner plans; without this, the claim that they enable efficient training without sacrificing the framework's advantages is unsupported.

    Authors: We acknowledge that the current manuscript does not contain formal error bounds, convergence guarantees, or targeted ablations for the sliced and linear approximations. In the revision we will expand §5 with approximation-error bounds (in terms of projection count for sliced Wasserstein and linearization parameter) that respect the nested geometry, together with convergence statements under suitable conditions. We will also add ablation experiments comparing exact WoW, sliced, and linear variants on trajectory straightness and fidelity of the coupled plans. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper's core claim—that measures over transport plans induce velocity fields realizing metameasure flows via nested Wasserstein geometry—is presented as a direct mathematical consequence of the established nested structure and flow matching framework. This is not reduced to fitted parameters, self-definitional loops, or load-bearing self-citations; the result follows from the geometry without the target being presupposed in the inputs. Scalable approximations (sliced/linear Wasserstein) are introduced separately for computation and do not retroactively define the theoretical flows. The unification of point cloud/set generation is an extension, not a renaming of known results. The derivation remains self-contained against external Wasserstein and flow matching benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that nested Wasserstein geometry induces well-defined velocity fields on the space of measures over transport plans; no explicit free parameters or invented entities are stated in the abstract.

axioms (1)
  • domain assumption: Nested Wasserstein geometry induces velocity fields that realize metameasure flows.
    Invoked to justify the extension of flow matching to WoW spaces.

pith-pipeline@v0.9.0 · 5444 in / 1148 out tokens · 69923 ms · 2026-05-12T00:48:26.268267+00:00 · methodology


Reference graph

Works this paper leans on

71 extracted references · 71 canonical work pages

  [1] Albergo, M., Vanden-Eijnden, E.: Building normalizing flows with stochastic interpolants. In: Proceedings of the ICLR’23. OpenReview.net (2023)
  [2] Ambrosio, L., Gigli, N., Savaré, G.: Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Springer Science & Business Media (2005)
  [3] Atanackovic, L., Zhang, X., Amos, B., Blanchette, M., Lee, L.J., Bengio, Y., Tong, A., Neklyudov, K.: Meta flow matching: Integrating vector fields on the Wasserstein manifold. In: Proceedings of the ICLR’25. OpenReview.net (2025)
  [4] Beier, F., Beinert, R., Steidl, G.: On a linear Gromov–Wasserstein distance. IEEE Transactions on Image Processing 31, 7292–7305 (2022)
  [5] Beiglböck, M., Pammer, G., Schrott, S.: A Brenier theorem on (P_2(...P_2(H)...), W_2) and applications to adapted transport. arXiv preprint arXiv:2509.03506 (2025)
  [6] Bonet, C., Vauthier, C., Korba, A.: Flowing datasets with Wasserstein over Wasserstein gradient flows. In: Proceedings of ICML’25. OpenReview.net (2025)
  [7] Bonet, C., Cazelles, E., Drumetz, L., Courty, N.: Busemann functions in the Wasserstein space: existence, closed-forms, and applications to slicing. arXiv preprint arXiv:2510.04579 (2025)
  [8] Bonnotte, N.: Unidimensional and Evolution Methods for Optimal Transportation. Ph.D. thesis, Université Paris Sud–Paris XI, Orsay, France (2013)
  [9] Boumal, N.: An Introduction to Optimization on Smooth Manifolds. Cambridge University Press (2023)
  [10] Chemseddine, J., Hagemann, P., Steidl, G., Wald, C.: Conditional Wasserstein distances with applications in Bayesian OT flow matching. Journal of Machine Learning Research 26(141), 1–47 (2025)

  [11] Chemseddine, J., Kornhardt, G., Duong, R., Steidl, G.: Adapting noise to data by quantile learning. In: Proceedings of the Deep Generative Model in Machine Learning: Theory, Principle and Efficacy ICLR Workshop. OpenReview.net (2026)
  [12] Chen, R.T., Lipman, Y.: Flow matching on general geometries. In: Proceedings of the ICLR’24. OpenReview.net (2024)
  [13] Cuturi, M.: Sinkhorn distances: Lightspeed computation of optimal transport. In: Advances in Neural Information Processing Systems. vol. 26. Curran Associates (2013)
  [14] De Bortoli, V., Korshunova, I., Mnih, A., Doucet, A.: Schrödinger bridge flow for unpaired data translation. Advances in Neural Information Processing Systems 37, 103384–103441 (2024)
  [15] Emami, P., Pass, B.: Optimal transport with optimal transport cost: the Monge–Kantorovich problem on Wasserstein spaces. Calculus of Variations and Partial Differential Equations 64(2), 43 (2025)
  [16] Flamary, R., Courty, N., Gramfort, A., Alaya, M.Z., Boisbunon, A., Chambon, S., Chapel, L., Corenflos, A., Fatras, K., Fournier, N., Gautheron, L., Gayraud, N.T.H., Janati, H., Rakotomamonjy, A., Redko, I., Rolet, A., Schutz, A., Seguy, V., Sutherland, D.J., Tavenard, R., Tong, A., Vayer, T.: POT: Python optimal transport. Journal of Machine Learning Research 22(78), 1–8 (2021), software available at https://pythonot.github.io/
  [17] Gat, I., Remez, T., Shaul, N., Kreuk, F., Chen, R.T.Q., Synnaeve, G., Adi, Y., Lipman, Y.: Discrete flow matching. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024), https://openreview.net/forum?id=GTDKo3Sv9p
  [18] Guo, M.H., Cai, J.X., Liu, Z.N., Mu, T.J., Martin, R.R., Hu, S.M.: PCT: Point cloud transformer. Computational Visual Media 7(2), 187–199 (2021)
  [19] Hagemann, P., Neumayer, S.: Stabilizing invertible neural networks using mixture models. Inverse Problems 37(8), 085002 (2021)
  [20] Haviv, D., Pooladian, A.A., Pe’er, D., Amos, B.: Wasserstein flow matching: Generative modeling over families of distributions. In: Proceedings of the ICML’25. OpenReview.net (2025)

  [21] Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems 33, 6840–6851 (2020)
  [22] Holderrieth, P., Havasi, M., Yim, J., Shaul, N., Gat, I., Jaakkola, T., Karrer, B., Chen, R.T., Lipman, Y.: Generator matching: Generative modeling with arbitrary Markov processes. In: Proceedings of the ICLR’25. OpenReview.net (2025)
  [23] Huesmann, M., Müller, B.: A Benamou–Brenier formula for transport distances between stationary random measures. Stochastic Processes and their Applications 185, 104633 (2025)
  [24] Hui, K.H., Liu, C., Zeng, X., Fu, C.W., Vahdat, A.: Not-so-optimal transport flows for 3d point cloud generation. In: Proceedings of the ICLR’25. OpenReview.net (2025)
  [25] Hull, J.J.: A database for handwritten text recognition research. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(5), 550–554 (1994). https://doi.org/10.1109/34.291440
  [26] Jiang, K., Cui, J., Dong, X., Toni, L.: Bures-Wasserstein flow matching for graph generation. In: Proceedings of the Generative AI and Biology ICML Workshop. OpenReview.net (2025)
  [27] Kerrigan, G., Migliorini, G., Smyth, P.: Functional flow matching. In: Proceedings of the AISTATS’24. pp. 3934–3942. PMLR (2024)
  [28] Kim, B., Kwon, G., Kim, K., Ye, J.C.: Unpaired image-to-image translation via neural Schrödinger bridge. In: Proceedings of the ICLR’24. OpenReview.net (2024)
  [29] Klein, L., Krämer, A., Noé, F.: Equivariant flow matching. Advances in Neural Information Processing Systems 36, 59886–59910 (2023)
  [30] Korotin, A., Selikhanovych, D., Burnaev, E.: Neural optimal transport. In: Proceedings of the ICLR’23. OpenReview.net (2023)

  [31] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. In: Proceedings of the IEEE. vol. 86, pp. 2278–2324. IEEE (1998)
  [32] Lee, J., Lee, Y., Kim, J., Kosiorek, A., Choi, S., Teh, Y.W.: Set transformer: A framework for attention-based permutation-invariant neural networks. In: Proceedings of the ICML’19. pp. 3744–3753. PMLR (2019)
  [33] Li, Y., Liu, C., Schönlieb, C.B.: Generative unordered flow for set-structured data generation. arXiv preprint arXiv:2501.17770 (2025)
  [34] Lin, T., Ho, N., Chen, X., Cuturi, M., Jordan, M.: Fixed-support Wasserstein barycenters: Computational hardness and fast algorithm. In: Advances in Neural Information Processing Systems. vol. 33, pp. 5368–5380. Curran Associates (2020)
  [35] Lindheim, J.v.: Simple approximative algorithms for free-support Wasserstein barycenters. Computational Optimization and Applications 85(1), 213–246 (2023)
  [36] Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: The Eleventh International Conference on Learning Representations (2023), https://openreview.net/forum?id=PqvMRDCJT9t
  [37] Lipman, Y., Chen, R.T., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: Proceedings of the ICLR’23. OpenReview.net (2023)
  [38] Liu, X., Gong, C., et al.: Flow straight and fast: Learning to generate and transfer data with rectified flow. In: Proceedings of the ICLR’23. OpenReview.net (2023)
  [39] Liu, X., Martin, R.D., Bai, Y., Shahbazi, A., Thorpe, M., Aldroubi, A., Kolouri, S.: Expected sliced transport plans. In: Proceedings of the ICLR’25 (2025)
  [40] Lüdke, D., Raventós, E.R., Kollovieh, M., Günnemann, S.: Unlocking point processes through point set diffusion. In: The Thirteenth International Conference on Learning Representations (2025), https://openreview.net/forum?id=4anfpHj0wf

  [41] Luo, S., Hu, W.: Diffusion probabilistic models for 3d point cloud generation. In: Proceedings of the CVPR’21. pp. 2837–2845. IEEE (2021)
  [42] Mousavi-Hosseini, A., Zhang, S.Y., Klein, M., Cuturi, M.: Flow matching with semidiscrete couplings. arXiv preprint arXiv:2509.25519 (2025)
  [43] Nguyen, D.H., Tsuda, K.: On a linear fused Gromov–Wasserstein distance for graph structured data. Pattern Recognition 138, 109351 (2023)
  [44] Nguyen, K., Nguyen, H., Pham, T., Ho, N.: Lightspeed geometric dataset distance via sliced optimal transport. In: Proceedings of ICML’25. OpenReview.net (2025)
  [45] Pandey, K., Pathak, J., Xu, Y., Mandt, S., Pritchard, M., Vahdat, A., Mardani, M.: Heavy-tailed diffusion models. In: Proceedings of the ICLR’25. OpenReview.net (2025)
  [46] Parzen, E.: On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33(3), 1065–1076 (1962)
  [47] Peyré, G., Cuturi, M.: Computational Optimal Transport: With Applications to Data Science, Foundations and Trends® in Machine Learning, vol. 11. Now Publishers (2019). https://doi.org/10.1561/2200000073
  [48] Piening, M., Beinert, R.: A novel sliced fused Gromov-Wasserstein distance. arXiv preprint arXiv:2508.02364 (2025)
  [49] Piening, M., Beinert, R.: Slicing the Gaussian mixture Wasserstein distance. Transactions on Machine Learning Research (2025)
  [50] Piening, M., Beinert, R.: Slicing Wasserstein Over Wasserstein via functional optimal transport. arXiv preprint arXiv:2509.22138 (2025)

  [51] Pinzi, A., Savaré, G.: Nested superposition principle for random measures and the geometry of the Wasserstein on Wasserstein space. arXiv preprint arXiv:2510.07523 (2025)
  [52] Pooladian, A.A., Ben-Hamu, H., Domingo-Enrich, C., Amos, B., Lipman, Y., Chen, R.T.: Multisample flow matching: Straightening flows with minibatch couplings. In: Proceedings of the ICML’23. OpenReview.net (2023)
  [53] Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: Deep learning on point sets for 3d classification and segmentation. In: Proceedings of the CVPR’17. pp. 652–660. IEEE (2017)
  [54] Ruscelli, F.: Flow matching on homogeneous spaces. arXiv preprint arXiv:2603.24829 (2026)
  [55] Santambrogio, F.: Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs and Modeling. Birkhäuser, Cham (2015)
  [56] Sarrazin, C., Schmitzer, B.: Linearized optimal transport on manifolds. SIAM Journal on Mathematical Analysis 56(4), 4970–5016 (2024)
  [57] Sato, R., Cuturi, M., Yamada, M., Kashima, H.: Fast and robust comparison of probability measures in heterogeneous spaces (2020), arXiv:2002.01615
  [58] Scott, D.W.: Multivariate Density Estimation: Theory, Practice, and Visualization. John Wiley & Sons, 2 edn. (2015)
  [59] Tang, X., Zhang, B., Wonka, P.: Generative human geometry distribution. In: Proceedings of the ICLR’26. OpenReview.net (2026)
  [60] Tong, A., Fatras, K., Malkin, N., Huguet, G., Zhang, Y., Rector-Brooks, J., Wolf, G., Bengio, Y.: Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research (2023)

  [61] Tong, A., Fatras, K., Malkin, N., Huguet, G., Zhang, Y., Rector-Brooks, J., Wolf, G., Bengio, Y.: Improving and generalizing flow-based generative models with minibatch optimal transport. Transactions on Machine Learning Research (2024)
  [62] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in Neural Information Processing Systems 30 (2017)
  [63] Villani, C.: Optimal Transport: Old and New, Grundlehren der mathematischen Wissenschaften, vol. 338. Springer-Verlag, Berlin (2009)
  [64] Wald, C., Steidl, G.: Flow matching: Markov kernels, stochastic processes and transport plans. Variational and Information Flows in Machine Learning and Optimal Transport, pp. 185–254 (2025)
  [65] Wang, W., Slepčev, D., Basu, S., Ozolek, J.A., Rohde, G.K.: A linear optimal transportation framework for quantifying and visualizing variations in sets of images. International Journal of Computer Vision 101(2), 254–269 (2013)
  [66] Yang, G., Huang, X., Hao, Z., Liu, M.Y., Belongie, S., Hariharan, B.: Pointflow: 3d point cloud generation with continuous normalizing flows. In: Proceedings of the ICCV’19. pp. 4541–4550. IEEE (2019)
  [67] Yang, L., Zhang, Z., Song, Y., Hong, S., Xu, R., Zhao, Y., Zhang, W., Cui, B., Yang, M.H.: Diffusion models: A comprehensive survey of methods and applications. ACM Computing Surveys 56(4), 1–39 (2023)
  [68] Yi, L., Kim, V.G., Ceylan, D., Shen, I.C., Yan, M., Su, H., Lu, C., Huang, Q., Sheffer, A., Guibas, L.: A scalable active framework for region annotation in 3d shape collections. ACM Transactions on Graphics (ToG) 35(6), 1–12 (2016)

  [69] Yiming, Q., Madeira, M., Thanou, D., Frossard, P.: DeFoG: Discrete flow matching for graph generation. In: Proceedings of the ICML’25. OpenReview.net (2025)

    Appendix text captured with this entry: A Background and Proofs. In order to describe the continuity equation (4) in the WoW setting, we need to define test functions suitable for testing on the infinite dimensional space $\mathcal{P}(\mathbb{R}^d)$. Definition...

  [70] (appendix text, not a bibliographic entry) We prove an upper and lower bound estimate, resulting in the claimed identity. Note that the upper bound estimate is due to [51, Proposition 3.1]. For completeness, we include it here. Upper bound ‘≤’. Let $\mathrm{OP}^* \in C(\boldsymbol{\mu}, \boldsymbol{\nu})$ realize the WoW distance via

    $$\mathbb{W}_2^2(\boldsymbol{\mu}, \boldsymbol{\nu}) = \int_{\mathcal{P}_2(\mathbb{R}^d) \times \mathcal{P}_2(\mathbb{R}^d)} W_2^2(\mu, \nu) \, \mathrm{d}\mathrm{OP}^*(\mu, \nu).$$

    By a measurable selection theorem [63, Cor. 5.22], there exi...

  [71] (appendix text, not a bibliographic entry) and a batch of source and target measures $(\hat{\mu}_i)_{i=1}^{B}$, $(\hat{\nu}_{i'})_{i'=1}^{B}$, we compute the outer transport plan as follows. First, we assemble the $B \times B$ cost matrix

    $$C \in \mathbb{R}^{B \times B}, \qquad C_{i,i'} := D(\hat{\mu}_i, \hat{\nu}_{i'}).$$

    We then solve the resulting discrete optimal transport problem

    $$\hat{\Pi}_{\mathrm{OP}} \in \operatorname*{arg\,min}_{\Pi \in \mathbb{R}^{B \times B}_{\ge 0}} \sum_{i,i'=1}^{B} C_{i,i'} \, \Pi_{i,i'} \quad \text{s.t.} \quad \sum_{i'} \Pi_{i,i'} = \sum_{i} \Pi_{i,i'} = \frac{1}{B},$$

    using an exact linear program...
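
The two steps described in this excerpt (assemble the cost matrix, then solve the discrete problem with marginals 1/B) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the inner cost D is taken to be exact squared W2 between equal-size uniform point clouds, and SciPy's `linear_sum_assignment` stands in for the linear program solver. The substitution is valid because, with uniform marginals 1/B on both sides, the linear program always has an optimal solution of the form (permutation matrix)/B.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def w2_sq(X, Y):
    """Squared W2 between equal-size uniform point clouds (inner problem)."""
    cost = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    rows, cols = linear_sum_assignment(cost)
    return cost[rows, cols].mean()

def outer_plan(sources, targets, D=w2_sq):
    """Cost matrix C[i, i'] = D(source_i, target_i') and an optimal outer plan."""
    B = len(sources)
    C = np.array([[D(mu, nu) for nu in targets] for mu in sources])
    rows, cols = linear_sum_assignment(C)   # optimal pairing of batch elements
    Pi = np.zeros((B, B))
    Pi[rows, cols] = 1.0 / B                # every row and column sums to 1/B
    return C, Pi

# Toy batch: three clouds spaced along the x-axis; the targets are slightly
# shifted copies listed in reverse order.
rng = np.random.default_rng(0)
base = rng.normal(size=(16, 2))
sources = [base + np.array([5.0 * i, 0.0]) for i in range(3)]
targets = [s + 0.1 for s in sources][::-1]

C, Pi = outer_plan(sources, targets)
pairing = Pi.argmax(axis=1)                 # source i is paired with target pairing[i]
```

In this toy batch the plan recovers the reversed ordering, so each source cloud is matched to its own shifted copy rather than to whichever target happens to share its batch index.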