Reliable Modeling of Distribution Shifts via Displacement-Reshaped Optimal Transport

Gr\'egoire Montavon; Jacob Kauffmann; Klaus-Robert M\"uller; Philip Naumann

arxiv: 2605.04965 · v1 · submitted 2026-05-06 · 💻 cs.LG · cs.AI

Reliable Modeling of Distribution Shifts via Displacement-Reshaped Optimal Transport

Philip Naumann , Jacob Kauffmann , Klaus-Robert M\"uller , Gr\'egoire Montavon This is my paper

Pith reviewed 2026-05-08 17:25 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords optimal transportdistribution shiftsMahalanobis distancedisplacementsground metricdomain adaptationcost matrix

0 comments

The pith

ReshapeOT reshapes the ground metric in optimal transport using observed displacements to achieve more reliable modeling of distribution shifts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Displacement-Reshaped Optimal Transport (ReshapeOT) as a way to improve how optimal transport models changes between distributions. It does so by replacing the standard Euclidean distance with a Mahalanobis distance calculated from the second moments of sample displacements. This creates preferred paths in the space that match the observed movements, leading to transport plans that better respect the true geometry of the shift. A reader would care if they work with data where distributions move in correlated ways, as this can enhance applications like transferring knowledge between different datasets. The approach is simple to implement on top of existing solvers.

Core claim

ReshapeOT replaces the Euclidean metric with a Mahalanobis distance estimated from displacement second moments. This effectively carves expressways through the input space, inviting transport solutions that better align with observed displacements. The method is computationally lightweight, integrates seamlessly into any OT solver that operates on a cost matrix, and can be kernelized for further flexibility.

What carries the argument

Displacement-Reshaped Optimal Transport (ReshapeOT), which integrates observed sample displacements to reshape the ground metric as a Mahalanobis distance from their second moments.

If this is right

Transport solutions more reliably capture the geometry of real distribution shifts.
Substantial gains in reliability demonstrated on synthetic and real-world data.
Easy integration into existing optimal transport solvers without high computational cost.
Applicability to practical use cases involving distribution shifts in machine learning.
Optional kernelization allows handling of nonlinear distribution shifts.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be extended to incorporate higher-order moments or other statistics of displacements if second moments prove insufficient.
In settings with limited displacement observations, regularization techniques might be needed to stabilize the Mahalanobis estimate.
This reshaping idea might apply to other transport-based methods beyond standard OT solvers.
Testing the approach on tasks with downstream performance metrics like classification under shift could reveal broader benefits.

Load-bearing premise

The second moments of observed sample displacements accurately reflect the true underlying geometry of the distribution shift without significant corruption from noise or biases.

What would settle it

Observing that ReshapeOT produces less reliable transports than standard OT on data where displacements are known to be noisy or unrepresentative would falsify the benefit of the reshaping approach.

Figures

Figures reproduced from arXiv: 2605.04965 by Gr\'egoire Montavon, Jacob Kauffmann, Klaus-Robert M\"uller, Philip Naumann.

**Figure 1.** Figure 1: Overview of our approach to influence the OT solution through ground-truth displacements. The classical OT solution (here with squared Euclidean costs) contains points that are spuriously transported across the manifold. In contrast, our proposed ReshapeOT method carves the original Euclidean distance into a new distance with lower associated costs along the displacements. This results in a more reliable, … view at source ↗

**Figure 2.** Figure 2: (a) Training observations (gray) from multiple bird tracking studies and manually selected displacements (red). (b) Coupling of classical OT from a squared Euclidean cost matrix built on Cartesian coordinates. (c) Coupling of ReshapeOT with RBF kernel (λ=1, α=103 ) on Cartesian coordinates. The insets show the induced square-root (for better visibility) cost fields of classical OT and ReshapeOT for a give… view at source ↗

**Figure 3.** Figure 3: Comparison of different OT-based domain adaptation methods on the Rotating Moons task at 40◦ rotation. Arrows denote the transport to the barycentric mapping, as determined by the coupling for each method and setting. The experiment uses 150 samples per class and domain, and Ne = 40 randomly selected ground-truth displacements. Classification errors at various degrees of target rotation are shown in view at source ↗

read the original abstract

Optimal transport (OT) is a central framework for modeling distribution shifts. Because OT compares distributions directly in input space, a well-designed ground metric between observations is essential to ensure that the optimizer does not violate the true geometry of change. We propose Displacement-Reshaped Optimal Transport (ReshapeOT), a method that reshapes the ground metric by integrating observed sample displacements as an additional source of knowledge. Technically, ReshapeOT replaces the Euclidean metric with a Mahalanobis distance estimated from displacement second moments. This effectively carves expressways through the input space, inviting transport solutions that better align with observed displacements. Our method is computationally lightweight, integrates seamlessly into any OT solver that operates on a cost matrix, and can be kernelized for further flexibility. Experiments on synthetic and real-world data show that ReshapeOT achieves substantial gains in transport reliability. We further demonstrate our method's usefulness in two practical use cases.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

ReshapeOT adds a simple Mahalanobis reshape to the OT cost matrix using displacement second moments, but the reliability gains hinge on those moments being clean and representative.

read the letter

The paper introduces ReshapeOT, which takes observed sample displacements and uses their second moments to define a Mahalanobis distance that replaces the usual Euclidean ground metric in optimal transport. This is meant to make the transport plans respect the actual geometry of the distribution shift better. What stands out is how straightforward the change is. You compute the covariance from the displacements once, plug it into the cost matrix, and any standard OT solver can use it. The kernelized version adds flexibility without much extra work. That practicality is a plus for applied work in domain adaptation. The main concern is whether the sample second moments actually capture the shift geometry without bias or noise. The method assumes the displacements are an independent, clean source of knowledge, but if they come from the same data or have measurement issues, the estimated metric could amplify errors. The abstract mentions no regularization or robustness steps, and the experiments are described only at a high level with no mention of specific baselines or statistical checks. Without those details, the substantial gains are hard to evaluate. This paper would interest researchers already using OT for shift modeling who can collect or have displacement data. It might be worth a read for ideas on metric adaptation, but the central claim needs stronger experimental support to be convincing. I would recommend sending it for peer review so the authors can clarify the data collection process and show more controlled results.

Referee Report

2 major / 2 minor

Summary. The paper introduces Displacement-Reshaped Optimal Transport (ReshapeOT), which modifies standard optimal transport by replacing the Euclidean ground metric with a Mahalanobis distance whose covariance is estimated from the second moments of observed sample displacements. This reshaping is intended to incorporate displacement information as an additional knowledge source, carving preferred transport routes that better respect the geometry of distribution shifts. The method is presented as computationally lightweight, compatible with any cost-matrix OT solver, and extendable via kernelization. Experiments on synthetic and real-world data are claimed to yield substantial improvements in transport reliability, with further illustrations in two practical applications.

Significance. If the second-moment estimate of displacements proves to be an unbiased and sufficient representation of shift geometry, ReshapeOT would provide a lightweight, plug-in enhancement to OT pipelines for modeling distribution shifts. The seamless integration with existing solvers and the kernelization option are practical strengths. The approach could be particularly useful in settings where displacement observations are readily available, potentially improving reliability without requiring entirely new OT formulations.

major comments (2)

Abstract and method description: The central claim that ReshapeOT yields substantially more reliable transport plans rests on the assumption that the sample covariance of observed displacements faithfully encodes the true shift geometry. However, the provided description supplies no details on displacement collection, regularization or shrinkage of the sample covariance, or robustness to measurement noise and selection effects. If these moments are corrupted, the induced Mahalanobis metric can systematically bias transport routes, directly undermining the reliability improvement.
Experimental claims (abstract): The assertion of 'substantial gains' in transport reliability is load-bearing for the paper's contribution, yet the abstract supplies no information on baselines, statistical tests, data splits, or controls for post-hoc choices. Without these, it is impossible to assess whether the reported improvements are robust or attributable to the reshaping.

minor comments (2)

The abstract would benefit from a brief statement of the two practical use cases to clarify the method's scope.
Notation for the Mahalanobis matrix and its estimation from displacement second moments should be introduced with an explicit equation in the method section for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and positive review of our work on Displacement-Reshaped Optimal Transport (ReshapeOT). We address each major comment point-by-point below, indicating revisions that will be incorporated to improve clarity and rigor.

read point-by-point responses

Referee: [—] Abstract and method description: The central claim that ReshapeOT yields substantially more reliable transport plans rests on the assumption that the sample covariance of observed displacements faithfully encodes the true shift geometry. However, the provided description supplies no details on displacement collection, regularization or shrinkage of the sample covariance, or robustness to measurement noise and selection effects. If these moments are corrupted, the induced Mahalanobis metric can systematically bias transport routes, directly undermining the reliability improvement.

Authors: We agree that the manuscript would benefit from expanded details on these practical aspects to strengthen the central claim. While Section 3 formally defines the Mahalanobis reshaping via the empirical second-moment matrix of displacements, we will revise the method section to explicitly describe: (i) displacement collection procedures (e.g., from paired source-target observations or domain-informed matching), (ii) application of a shrinkage estimator such as Ledoit-Wolf to regularize the sample covariance for positive-definiteness and noise robustness, and (iii) an added discussion subsection analyzing sensitivity to measurement noise and selection bias, supported by additional synthetic experiments with controlled corruption levels. These changes will clarify assumptions and address potential biases without altering the core algorithm. revision: yes
Referee: [—] Experimental claims (abstract): The assertion of 'substantial gains' in transport reliability is load-bearing for the paper's contribution, yet the abstract supplies no information on baselines, statistical tests, data splits, or controls for post-hoc choices. Without these, it is impossible to assess whether the reported improvements are robust or attributable to the reshaping.

Authors: We acknowledge the abstract's brevity limits evaluation of the experimental claims. In the revision, we will update the abstract to reference the primary baselines (standard Euclidean OT, entropic OT variants), note that reliability gains are assessed via repeated trials with statistical significance (paired t-tests across seeds), and indicate use of standard data partitioning (e.g., cross-validation splits on real-world shift datasets). Full details on controls, post-hoc analyses, and robustness checks remain in Section 5, but the abstract will now provide sufficient context. The full experiments already demonstrate consistent improvements across synthetic and real data; these clarifications will make that evidence more transparent. revision: yes

Circularity Check

1 steps flagged

Mahalanobis metric estimated from displacement second moments makes alignment gains tautological

specific steps

fitted input called prediction [Abstract]
"ReshapeOT replaces the Euclidean metric with a Mahalanobis distance estimated from displacement second moments. This effectively carves expressways through the input space, inviting transport solutions that better align with observed displacements. Experiments on synthetic and real-world data show that ReshapeOT achieves substantial gains in transport reliability."

The Mahalanobis matrix is computed from the second moments of the identical displacement samples later used to measure 'transport reliability.' Consequently the OT optimizer is guaranteed to produce plans that align better with those samples once the metric has been fitted to them; the reported gains are a direct statistical consequence of the estimation step rather than an external validation.

full rationale

The paper's core technical step estimates the Mahalanobis covariance directly from the second moments of the observed sample displacements and then uses the resulting metric inside OT. The headline experimental claim of 'substantial gains in transport reliability' is evaluated on the same displacements, so improved alignment follows by construction from the fitting procedure rather than from an independent test of whether the estimated geometry reflects the true shift. No self-citations, uniqueness theorems, or ansatzes from prior work are load-bearing; the circularity is limited to the data-dependence of the metric estimation itself. The method remains a coherent modeling choice but the validation of its reliability benefit reduces to the input.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The approach rests on the domain assumption that observed displacements encode the relevant geometry of change and on the standard OT assumption that a cost matrix can be supplied to any solver. No new entities are postulated.

free parameters (1)

Mahalanobis matrix from displacement second moments
Estimated directly from the observed displacements and used to define the reshaped ground metric.

axioms (1)

domain assumption Observed sample displacements reflect the true underlying geometry of the distribution shift
Invoked to justify replacing the Euclidean metric with the data-derived Mahalanobis metric.

pith-pipeline@v0.9.0 · 5465 in / 1123 out tokens · 34458 ms · 2026-05-08T17:25:46.809160+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

50 extracted references · 50 canonical work pages

[1]

Covariate Shift Adaptation by Importance Weighted Cross Validation.J

Sugiyama, M.; Krauledat, M.; Müller, K.R. Covariate Shift Adaptation by Importance Weighted Cross Validation.J. Mach. Learn. Res.2007,8, 985–1005

work page 2007
[2]

Quionero-Candela, J.; Sugiyama, M.; Schwaighofer, A.; Lawrence, N.D.Dataset Shift in Machine Learning; The MIT Press, 2009

work page 2009
[3]

Amari, S.I.Information Geometry and Its Applications, 1 ed.; Applied Mathematical Sciences, Springer: Tokyo, Japan, 2016

work page 2016
[4]

191, American Mathematical Soc., 2000

Amari, S.i.; Nagaoka, H.Methods of information geometry; Vol. 191, American Mathematical Soc., 2000

work page 2000
[5]

The geometry of dissipative evolution equations: the porous medium equation.Communications in Partial Differential Equations2001,26, 101–174

Otto, F. The geometry of dissipative evolution equations: the porous medium equation.Communications in Partial Differential Equations2001,26, 101–174

work page
[6]

Villani, C.Optimal Transport: Old and New; Grundlehren Der Mathematischen Wissenschaften, Springer Berlin Heidelberg, 2008

work page 2008
[7]

Computational Optimal Transport with Applications to Data Sciences.Foundations and Trends in Machine Learning2019,11, 355–607

Peyré, G.; Cuturi, M. Computational Optimal Transport with Applications to Data Sciences.Foundations and Trends in Machine Learning2019,11, 355–607

work page
[8]

Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming.Cell2019,176, 928–943

Schiebinger, G.; Shu, J.; Tabaka, M.; Cleary, B.; Subramanian, V .; Solomon, A.; Gould, J.; Liu, S.; Lin, S.; Berube, P .; et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming.Cell2019,176, 928–943

work page
[9]

Mapping Cells through Time and Space with Moscot.Nature2025,638, 1065–1075

Klein, D.; Palla, G.; Lange, M.; Klein, M.; Piran, Z.; Gander, M.; Meng-Papaxanthos, L.; Sterr, M.; Saber, L.; Jing, C.; et al. Mapping Cells through Time and Space with Moscot.Nature2025,638, 1065–1075

work page
[10]

Wasserstein training of restricted Boltzmann machines

Montavon, G.; Müller, K.R.; Cuturi, M. Wasserstein training of restricted Boltzmann machines. 2016, Vol. 29, Advances in Neural Information Processing Systems, pp. 3711–3719

work page 2016
[11]

Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence2017,39, 1853–1865

Courty, N.; Flamary, R.; Tuia, D.; Rakotomamonjy, A. Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence2017,39, 1853–1865

work page
[12]

Learning domain invariant representations by joint Wasserstein distance minimization.Neural Networks2023,167, 233–243

Andéol, L.; Kawakami, Y.; Wada, Y.; Kanamori, T.; Müller, K.; Montavon, G. Learning domain invariant representations by joint Wasserstein distance minimization.Neural Networks2023,167, 233–243

work page
[13]

Geodesic Sinkhorn For Fast and Accurate Optimal Transport on Manifolds

Huguet, G.; Tong, A.; Zapatero, M.R.; Tape, C.J.; Wolf, G.; Krishnaswamy, S. Geodesic Sinkhorn For Fast and Accurate Optimal Transport on Manifolds. In Proceedings of the MLSP. IEEE, 2023, pp. 1–6

work page 2023
[14]

Subspace Robust Wasserstein Distances

Paty, F.; Cuturi, M. Subspace Robust Wasserstein Distances. In Proceedings of the ICML. PMLR, 2019, Vol. 97,Proceedings of Machine Learning Research, pp. 5072–5081

work page 2019
[15]

Wasserstein Distances Made Explainable: Insights Into Dataset Shifts and Transport Phenomena.IEEE Transactions on Pattern Analysis and Machine Intelligence2026

Naumann, P .; Kauffmann, J.; Montavon, G. Wasserstein Distances Made Explainable: Insights Into Dataset Shifts and Transport Phenomena.IEEE Transactions on Pattern Analysis and Machine Intelligence2026. https://doi.org/10.1109/TPAMI.2026.3656947

work page doi:10.1109/tpami.2026.3656947 2026
[16]

Regularized Discrete Optimal Transport.SIAM J

Ferradans, S.; Papadakis, N.; Peyré, G.; Aujol, J. Regularized Discrete Optimal Transport.SIAM J. Imaging Sci.2014,7, 1853–1882

work page 2014
[17]

Feedback Schrödinger Bridge Matching

Theodoropoulos, P .; Komianos, N.; Pacelli, V .; Liu, G.; Theodorou, E.A. Feedback Schrödinger Bridge Matching. In Proceedings of the ICLR. OpenReview.net, 2025

work page 2025
[18]

Making transport more robust and interpretable by moving data through a small number of anchor points

Lin, C.; Azabou, M.; Dyer, E.L. Making transport more robust and interpretable by moving data through a small number of anchor points. In Proceedings of the ICML, 2021, Vol. 139,Proceedings of Machine Learning Research, pp. 6631–6641

work page 2021
[19]

Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces.CoRR2020,abs/2002.01615

Sato, R.; Cuturi, M.; Yamada, M.; Kashima, H. Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces.CoRR2020,abs/2002.01615

work page arXiv 2002
[20]

Keypoint-Guided Optimal Transport with Applications in Heterogeneous Domain Adaptation

Gu, X.; Yang, Y.; Zeng, W.; Sun, J.; Xu, Z. Keypoint-Guided Optimal Transport with Applications in Heterogeneous Domain Adaptation. 2022, Vol. 35,Advances in Neural Information Processing Systems, pp. 14972–14985

work page 2022
[21]

Ground metric learning.J

Cuturi, M.; Avis, D. Ground metric learning.J. Mach. Learn. Res.2014,15, 533–564

work page 2014
[22]

Metric Learning in Optimal Transport for Domain Adaptation

Kerdoncuff, T.; Emonet, R.; Sebban, M. Metric Learning in Optimal Transport for Domain Adaptation. In Proceedings of the IJCAI. ijcai.org, 2020, pp. 2162–2168

work page 2020
[23]

A Riemannian Approach to Ground Metric Learning for Optimal Transport

Jawanpuria, P .; Shi, D.; Mishra, B.; Gao, J. A Riemannian Approach to Ground Metric Learning for Optimal Transport. In Proceedings of the ICASSP. IEEE, 2025, pp. 1–5. 16 of 20

work page 2025
[24]

Neighbourhood Components Analysis

Goldberger, J.; Roweis, S.T.; Hinton, G.E.; Salakhutdinov, R. Neighbourhood Components Analysis. 2004, Vol. 17,Advances in Neural Information Processing Systems, pp. 513–520

work page 2004
[25]

Information-theoretic metric learning

Davis, J.V .; Kulis, B.; Jain, P .; Sra, S.; Dhillon, I.S. Information-theoretic metric learning. In Proceedings of the ICML. ACM, 2007, ACM International Conference Proceeding Series, pp. 209–216

work page 2007
[26]

Distance Metric Learning for Large Margin Nearest Neighbor Classification.J

Weinberger, K.Q.; Saul, L.K. Distance Metric Learning for Large Margin Nearest Neighbor Classification.J. Mach. Learn. Res.2009,10, 207–244

work page 2009
[27]

Learning a Mahalanobis Metric from Equivalence Constraints.J

Bar-Hillel, A.; Hertz, T.; Shental, N.; Weinshall, D. Learning a Mahalanobis Metric from Equivalence Constraints.J. Mach. Learn. Res.2005,6, 937–965

work page 2005
[28]

Ground Metric Learning on Graphs.J

Heitz, M.; Bonneel, N.; Coeurjolly, D.; Cuturi, M.; Peyré, G. Ground Metric Learning on Graphs.J. Math. Imaging Vis.2021,63, 89–107

work page 2021
[29]

Riemannian Metric Learning via Optimal Transport

Scarvelis, C.; Solomon, J. Riemannian Metric Learning via Optimal Transport. In Proceedings of the ICLR. OpenReview.net, 2023

work page 2023
[30]

Neural Optimal Transport with Lagrangian Costs

Pooladian, A.; Domingo-Enrich, C.; Chen, R.T.Q.; Amos, B. Neural Optimal Transport with Lagrangian Costs. In Proceedings of the UAI. PMLR, 2024, Proceedings of Machine Learning Research, pp. 2989–3003

work page 2024
[31]

Metric Flow Matching for Smooth Interpolations on the Data Manifold

Kapusniak, K.; Potaptchik, P .; Reu, T.; Zhang, L.; Tong, A.; Bronstein, M.M.; Bose, A.J.; Giovanni, F.D. Metric Flow Matching for Smooth Interpolations on the Data Manifold. 2024, Vol. 37,Advances in Neural Information Processing Systems

work page 2024
[32]

Learning transport cost from subset correspondence

Liu, R.; Balsubramani, A.; Zou, J. Learning transport cost from subset correspondence. In Proceedings of the ICLR. OpenReview.net, 2020

work page 2020
[33]

Sinkhorn Distances: Lightspeed Computation of Optimal Transport

Cuturi, M. Sinkhorn Distances: Lightspeed Computation of Optimal Transport. 2013, Vol. 26,Advances in Neural Information Processing Systems, pp. 2292–2300

work page 2013
[34]

Mapping Estimation for Discrete Optimal Transport

Perrot, M.; Courty, N.; Flamary, R.; Habrard, A. Mapping Estimation for Discrete Optimal Transport. 2016, Vol. 29,Advances in Neural Information Processing Systems, pp. 4197–4205

work page 2016
[35]

Inverse Optimal Transport.SIAM J

Stuart, A.M.; Wolfram, M. Inverse Optimal Transport.SIAM J. Appl. Math.2020,80, 599–619

work page 2020
[36]

Sparsistency for inverse optimal transport

Andrade, F.; Peyré, G.; Poon, C. Sparsistency for inverse optimal transport. In Proceedings of the ICLR. OpenReview.net, 2024

work page 2024
[37]

Meta Optimal Transport

Amos, B.; Luise, G.; Cohen, S.; Redko, I. Meta Optimal Transport. In Proceedings of the ICML. PMLR, 2023, Proceedings of Machine Learning Research, pp. 791–813

work page 2023
[38]

Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds

Bonet, C.; Drumetz, L.; Courty, N. Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds. J. Mach. Learn. Res.2025,26, 32:1–32:76

work page 2025
[39]

Nonlinear component analysis as a kernel eigenvalue problem.Neural computation1998,10, 1299–1319

Schölkopf, B.; Smola, A.; Müller, K.R. Nonlinear component analysis as a kernel eigenvalue problem.Neural computation1998,10, 1299–1319

work page
[40]

An introduction to kernel-based learning algorithms.IEEE Trans

Müller, K.R.; Mika, S.; Rätsch, G.; Tsuda, K.; Schölkopf, B. An introduction to kernel-based learning algorithms.IEEE Trans. Neural Networks2001,12, 181–201

work page
[41]

Schölkopf, B.; Smola, A.J.Learning with kernels; Adaptive Computation and Machine Learning series, MIT Press: London, England, 2002

work page 2002
[42]

Optimal Transport in Reproducing Kernel Hilbert Spaces: Theory and Applications.IEEE Trans

Zhang, Z.; Wang, M.; Nehorai, A. Optimal Transport in Reproducing Kernel Hilbert Spaces: Theory and Applications.IEEE Trans. Pattern Anal. Mach. Intell.2020,42, 1741–1754

work page 2020
[43]

Air Quality

Vito, S. Air Quality. UCI Machine Learning Repository, 2008. https://doi.org/10.24432/C59K5F

work page doi:10.24432/c59k5f 2008
[44]

Appliances Energy Prediction

Candanedo, L. Appliances Energy Prediction. UCI Machine Learning Repository, 2017. https://doi.org/10 .24432/C5VC8G

work page 2017
[45]

The Movebank system for studying global animal movement and demography.Methods in Ecology and Evolution2022,13, 419–431

Kays, R.; Davidson, S.C.; Berger, M.; Bohrer, G.; Fiedler, W.; Flack, A.; Hirt, J.; Hahn, C.; Gauggel, D.; Russell, B.; et al. The Movebank system for studying global animal movement and demography.Methods in Ecology and Evolution2022,13, 419–431

work page
[46]

Unsupervised Domain Adaptation by Backpropagation

Ganin, Y.; Lempitsky, V .S. Unsupervised Domain Adaptation by Backpropagation. In Proceedings of the ICML. JMLR.org, 2015, JMLR Workshop and Conference Proceedings, pp. 1180–1189

work page 2015
[47]

Kömen, E

Kömen, J.; de Jong, E.D.; Hense, J.; Marienwald, H.; Dippel, J.; Naumann, P .; Marcus, E.; Ruff, L.; Alber, M.; Teuwen, J.; et al. Towards Robust Foundation Models for Digital Pathology.CoRR2025,abs/2507.17845

work page arXiv
[48]

Partial Optimal Tranport with Applications on Positive-Unlabeled Learning

Chapel, L.; Alaya, M.Z.; Gasso, G. Partial Optimal Tranport with Applications on Positive-Unlabeled Learning. 2020, Vol. 33,Advances in Neural Information Processing Systems, pp. 2903–2913

work page 2020
[49]

Information geometry connecting Wasserstein distance and Kull- back–Leibler divergence via the entropy-relaxed transportation problem.Information Geometry2018,1, 13–37

Amari, S.i.; Karakida, R.; Oizumi, M. Information geometry connecting Wasserstein distance and Kull- back–Leibler divergence via the entropy-relaxed transportation problem.Information Geometry2018,1, 13–37

work page
[50]

Information geometry of the Otto metric.Information Geometry2024,8, 209–232

Ay, N. Information geometry of the Otto metric.Information Geometry2024,8, 209–232. 17 of 20 SUPPLEMENTARYNOTES Supplementary Note A Time-series Dataset Details TheAir Quality[ 43] andAppliances[ 44] time-series datasets used in Section 4 were preprocessed in line with [15], with the following main changes: • We use a RobustScaler (based on the median and...

work page doi:10.5441/001/1.3hp3s250 2012

[1] [1]

Covariate Shift Adaptation by Importance Weighted Cross Validation.J

Sugiyama, M.; Krauledat, M.; Müller, K.R. Covariate Shift Adaptation by Importance Weighted Cross Validation.J. Mach. Learn. Res.2007,8, 985–1005

work page 2007

[2] [2]

Quionero-Candela, J.; Sugiyama, M.; Schwaighofer, A.; Lawrence, N.D.Dataset Shift in Machine Learning; The MIT Press, 2009

work page 2009

[3] [3]

Amari, S.I.Information Geometry and Its Applications, 1 ed.; Applied Mathematical Sciences, Springer: Tokyo, Japan, 2016

work page 2016

[4] [4]

191, American Mathematical Soc., 2000

Amari, S.i.; Nagaoka, H.Methods of information geometry; Vol. 191, American Mathematical Soc., 2000

work page 2000

[5] [5]

The geometry of dissipative evolution equations: the porous medium equation.Communications in Partial Differential Equations2001,26, 101–174

Otto, F. The geometry of dissipative evolution equations: the porous medium equation.Communications in Partial Differential Equations2001,26, 101–174

work page

[6] [6]

Villani, C.Optimal Transport: Old and New; Grundlehren Der Mathematischen Wissenschaften, Springer Berlin Heidelberg, 2008

work page 2008

[7] [7]

Computational Optimal Transport with Applications to Data Sciences.Foundations and Trends in Machine Learning2019,11, 355–607

Peyré, G.; Cuturi, M. Computational Optimal Transport with Applications to Data Sciences.Foundations and Trends in Machine Learning2019,11, 355–607

work page

[8] [8]

Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming.Cell2019,176, 928–943

Schiebinger, G.; Shu, J.; Tabaka, M.; Cleary, B.; Subramanian, V .; Solomon, A.; Gould, J.; Liu, S.; Lin, S.; Berube, P .; et al. Optimal-transport analysis of single-cell gene expression identifies developmental trajectories in reprogramming.Cell2019,176, 928–943

work page

[9] [9]

Mapping Cells through Time and Space with Moscot.Nature2025,638, 1065–1075

Klein, D.; Palla, G.; Lange, M.; Klein, M.; Piran, Z.; Gander, M.; Meng-Papaxanthos, L.; Sterr, M.; Saber, L.; Jing, C.; et al. Mapping Cells through Time and Space with Moscot.Nature2025,638, 1065–1075

work page

[10] [10]

Wasserstein training of restricted Boltzmann machines

Montavon, G.; Müller, K.R.; Cuturi, M. Wasserstein training of restricted Boltzmann machines. 2016, Vol. 29, Advances in Neural Information Processing Systems, pp. 3711–3719

work page 2016

[11] [11]

Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence2017,39, 1853–1865

Courty, N.; Flamary, R.; Tuia, D.; Rakotomamonjy, A. Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence2017,39, 1853–1865

work page

[12] [12]

Learning domain invariant representations by joint Wasserstein distance minimization.Neural Networks2023,167, 233–243

Andéol, L.; Kawakami, Y.; Wada, Y.; Kanamori, T.; Müller, K.; Montavon, G. Learning domain invariant representations by joint Wasserstein distance minimization.Neural Networks2023,167, 233–243

work page

[13] [13]

Geodesic Sinkhorn For Fast and Accurate Optimal Transport on Manifolds

Huguet, G.; Tong, A.; Zapatero, M.R.; Tape, C.J.; Wolf, G.; Krishnaswamy, S. Geodesic Sinkhorn For Fast and Accurate Optimal Transport on Manifolds. In Proceedings of the MLSP. IEEE, 2023, pp. 1–6

work page 2023

[14] [14]

Subspace Robust Wasserstein Distances

Paty, F.; Cuturi, M. Subspace Robust Wasserstein Distances. In Proceedings of the ICML. PMLR, 2019, Vol. 97,Proceedings of Machine Learning Research, pp. 5072–5081

work page 2019

[15] [15]

Wasserstein Distances Made Explainable: Insights Into Dataset Shifts and Transport Phenomena.IEEE Transactions on Pattern Analysis and Machine Intelligence2026

Naumann, P .; Kauffmann, J.; Montavon, G. Wasserstein Distances Made Explainable: Insights Into Dataset Shifts and Transport Phenomena.IEEE Transactions on Pattern Analysis and Machine Intelligence2026. https://doi.org/10.1109/TPAMI.2026.3656947

work page doi:10.1109/tpami.2026.3656947 2026

[16] [16]

Regularized Discrete Optimal Transport.SIAM J

Ferradans, S.; Papadakis, N.; Peyré, G.; Aujol, J. Regularized Discrete Optimal Transport.SIAM J. Imaging Sci.2014,7, 1853–1882

work page 2014

[17] [17]

Feedback Schrödinger Bridge Matching

Theodoropoulos, P .; Komianos, N.; Pacelli, V .; Liu, G.; Theodorou, E.A. Feedback Schrödinger Bridge Matching. In Proceedings of the ICLR. OpenReview.net, 2025

work page 2025

[18] [18]

Making transport more robust and interpretable by moving data through a small number of anchor points

Lin, C.; Azabou, M.; Dyer, E.L. Making transport more robust and interpretable by moving data through a small number of anchor points. In Proceedings of the ICML, 2021, Vol. 139,Proceedings of Machine Learning Research, pp. 6631–6641

work page 2021

[19] [19]

Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces.CoRR2020,abs/2002.01615

Sato, R.; Cuturi, M.; Yamada, M.; Kashima, H. Fast and Robust Comparison of Probability Measures in Heterogeneous Spaces.CoRR2020,abs/2002.01615

work page arXiv 2002

[20] [20]

Keypoint-Guided Optimal Transport with Applications in Heterogeneous Domain Adaptation

Gu, X.; Yang, Y.; Zeng, W.; Sun, J.; Xu, Z. Keypoint-Guided Optimal Transport with Applications in Heterogeneous Domain Adaptation. 2022, Vol. 35,Advances in Neural Information Processing Systems, pp. 14972–14985

work page 2022

[21] [21]

Ground metric learning.J

Cuturi, M.; Avis, D. Ground metric learning.J. Mach. Learn. Res.2014,15, 533–564

work page 2014

[22] [22]

Metric Learning in Optimal Transport for Domain Adaptation

Kerdoncuff, T.; Emonet, R.; Sebban, M. Metric Learning in Optimal Transport for Domain Adaptation. In Proceedings of the IJCAI. ijcai.org, 2020, pp. 2162–2168

work page 2020

[23] [23]

A Riemannian Approach to Ground Metric Learning for Optimal Transport

Jawanpuria, P .; Shi, D.; Mishra, B.; Gao, J. A Riemannian Approach to Ground Metric Learning for Optimal Transport. In Proceedings of the ICASSP. IEEE, 2025, pp. 1–5. 16 of 20

work page 2025

[24] [24]

Neighbourhood Components Analysis

Goldberger, J.; Roweis, S.T.; Hinton, G.E.; Salakhutdinov, R. Neighbourhood Components Analysis. 2004, Vol. 17,Advances in Neural Information Processing Systems, pp. 513–520

work page 2004

[25] [25]

Information-theoretic metric learning

Davis, J.V .; Kulis, B.; Jain, P .; Sra, S.; Dhillon, I.S. Information-theoretic metric learning. In Proceedings of the ICML. ACM, 2007, ACM International Conference Proceeding Series, pp. 209–216

work page 2007

[26] [26]

Distance Metric Learning for Large Margin Nearest Neighbor Classification.J

Weinberger, K.Q.; Saul, L.K. Distance Metric Learning for Large Margin Nearest Neighbor Classification.J. Mach. Learn. Res.2009,10, 207–244

work page 2009

[27] [27]

Learning a Mahalanobis Metric from Equivalence Constraints.J

Bar-Hillel, A.; Hertz, T.; Shental, N.; Weinshall, D. Learning a Mahalanobis Metric from Equivalence Constraints.J. Mach. Learn. Res.2005,6, 937–965

work page 2005

[28] [28]

Ground Metric Learning on Graphs.J

Heitz, M.; Bonneel, N.; Coeurjolly, D.; Cuturi, M.; Peyré, G. Ground Metric Learning on Graphs.J. Math. Imaging Vis.2021,63, 89–107

work page 2021

[29] [29]

Riemannian Metric Learning via Optimal Transport

Scarvelis, C.; Solomon, J. Riemannian Metric Learning via Optimal Transport. In Proceedings of the ICLR. OpenReview.net, 2023

work page 2023

[30] [30]

Neural Optimal Transport with Lagrangian Costs

Pooladian, A.; Domingo-Enrich, C.; Chen, R.T.Q.; Amos, B. Neural Optimal Transport with Lagrangian Costs. In Proceedings of the UAI. PMLR, 2024, Proceedings of Machine Learning Research, pp. 2989–3003

work page 2024

[31] [31]

Metric Flow Matching for Smooth Interpolations on the Data Manifold

Kapusniak, K.; Potaptchik, P .; Reu, T.; Zhang, L.; Tong, A.; Bronstein, M.M.; Bose, A.J.; Giovanni, F.D. Metric Flow Matching for Smooth Interpolations on the Data Manifold. 2024, Vol. 37,Advances in Neural Information Processing Systems

work page 2024

[32] [32]

Learning transport cost from subset correspondence

Liu, R.; Balsubramani, A.; Zou, J. Learning transport cost from subset correspondence. In Proceedings of the ICLR. OpenReview.net, 2020

work page 2020

[33] [33]

Sinkhorn Distances: Lightspeed Computation of Optimal Transport

Cuturi, M. Sinkhorn Distances: Lightspeed Computation of Optimal Transport. 2013, Vol. 26,Advances in Neural Information Processing Systems, pp. 2292–2300

work page 2013

[34] [34]

Mapping Estimation for Discrete Optimal Transport

Perrot, M.; Courty, N.; Flamary, R.; Habrard, A. Mapping Estimation for Discrete Optimal Transport. 2016, Vol. 29,Advances in Neural Information Processing Systems, pp. 4197–4205

work page 2016

[35] [35]

Inverse Optimal Transport.SIAM J

Stuart, A.M.; Wolfram, M. Inverse Optimal Transport.SIAM J. Appl. Math.2020,80, 599–619

work page 2020

[36] [36]

Sparsistency for inverse optimal transport

Andrade, F.; Peyré, G.; Poon, C. Sparsistency for inverse optimal transport. In Proceedings of the ICLR. OpenReview.net, 2024

work page 2024

[37] [37]

Meta Optimal Transport

Amos, B.; Luise, G.; Cohen, S.; Redko, I. Meta Optimal Transport. In Proceedings of the ICML. PMLR, 2023, Proceedings of Machine Learning Research, pp. 791–813

work page 2023

[38] [38]

Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds

Bonet, C.; Drumetz, L.; Courty, N. Sliced-Wasserstein Distances and Flows on Cartan-Hadamard Manifolds. J. Mach. Learn. Res.2025,26, 32:1–32:76

work page 2025

[39] [39]

Nonlinear component analysis as a kernel eigenvalue problem.Neural computation1998,10, 1299–1319

Schölkopf, B.; Smola, A.; Müller, K.R. Nonlinear component analysis as a kernel eigenvalue problem.Neural computation1998,10, 1299–1319

work page

[40] [40]

An introduction to kernel-based learning algorithms.IEEE Trans

Müller, K.R.; Mika, S.; Rätsch, G.; Tsuda, K.; Schölkopf, B. An introduction to kernel-based learning algorithms.IEEE Trans. Neural Networks2001,12, 181–201

work page

[41] [41]

Schölkopf, B.; Smola, A.J.Learning with kernels; Adaptive Computation and Machine Learning series, MIT Press: London, England, 2002

work page 2002

[42] [42]

Optimal Transport in Reproducing Kernel Hilbert Spaces: Theory and Applications.IEEE Trans

Zhang, Z.; Wang, M.; Nehorai, A. Optimal Transport in Reproducing Kernel Hilbert Spaces: Theory and Applications.IEEE Trans. Pattern Anal. Mach. Intell.2020,42, 1741–1754

work page 2020

[43] [43]

Air Quality

Vito, S. Air Quality. UCI Machine Learning Repository, 2008. https://doi.org/10.24432/C59K5F

work page doi:10.24432/c59k5f 2008

[44] [44]

Appliances Energy Prediction

Candanedo, L. Appliances Energy Prediction. UCI Machine Learning Repository, 2017. https://doi.org/10 .24432/C5VC8G

work page 2017

[45] [45]

The Movebank system for studying global animal movement and demography.Methods in Ecology and Evolution2022,13, 419–431

Kays, R.; Davidson, S.C.; Berger, M.; Bohrer, G.; Fiedler, W.; Flack, A.; Hirt, J.; Hahn, C.; Gauggel, D.; Russell, B.; et al. The Movebank system for studying global animal movement and demography.Methods in Ecology and Evolution2022,13, 419–431

work page

[46] [46]

Unsupervised Domain Adaptation by Backpropagation

Ganin, Y.; Lempitsky, V .S. Unsupervised Domain Adaptation by Backpropagation. In Proceedings of the ICML. JMLR.org, 2015, JMLR Workshop and Conference Proceedings, pp. 1180–1189

work page 2015

[47] [47]

Kömen, E

Kömen, J.; de Jong, E.D.; Hense, J.; Marienwald, H.; Dippel, J.; Naumann, P .; Marcus, E.; Ruff, L.; Alber, M.; Teuwen, J.; et al. Towards Robust Foundation Models for Digital Pathology.CoRR2025,abs/2507.17845

work page arXiv

[48] [48]

Partial Optimal Tranport with Applications on Positive-Unlabeled Learning

Chapel, L.; Alaya, M.Z.; Gasso, G. Partial Optimal Tranport with Applications on Positive-Unlabeled Learning. 2020, Vol. 33,Advances in Neural Information Processing Systems, pp. 2903–2913

work page 2020

[49] [49]

Information geometry connecting Wasserstein distance and Kull- back–Leibler divergence via the entropy-relaxed transportation problem.Information Geometry2018,1, 13–37

Amari, S.i.; Karakida, R.; Oizumi, M. Information geometry connecting Wasserstein distance and Kull- back–Leibler divergence via the entropy-relaxed transportation problem.Information Geometry2018,1, 13–37

work page

[50] [50]

Information geometry of the Otto metric.Information Geometry2024,8, 209–232

Ay, N. Information geometry of the Otto metric.Information Geometry2024,8, 209–232. 17 of 20 SUPPLEMENTARYNOTES Supplementary Note A Time-series Dataset Details TheAir Quality[ 43] andAppliances[ 44] time-series datasets used in Section 4 were preprocessed in line with [15], with the following main changes: • We use a RobustScaler (based on the median and...

work page doi:10.5441/001/1.3hp3s250 2012