pith. sign in

arxiv: 2605.18389 · v2 · pith:TGGAWKZ7new · submitted 2026-05-18 · 💻 cs.LG · math.OC

Spherical Harmonic Optimal Transport: Application to Climate Models Comparisons

Pith reviewed 2026-05-20 12:53 UTC · model grok-4.3

classification 💻 cs.LG math.OC
keywords optimal transportheat kernelspherical harmonicsSinkhorn divergenceclimate modelsmanifold learningunbalanced transportcomputational efficiency
0
0 comments X

The pith

Heat kernel costs converge to optimal transport costs as diffusion time vanishes on manifolds, yielding a fast spherical harmonic Sinkhorn algorithm on the sphere.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper establishes that the heat kernel cost converges to the optimal transport cost as time vanishes in both balanced and unbalanced cases on manifolds. On the 2-sphere, the resulting Sinkhorn divergences preserve the geometric and analytic properties of classical optimal transport. The authors exploit the sphere's harmonic structure to obtain a fast algorithm needing only linear memory and cubic-root scaling in time per iteration, with dense operations suited to GPUs. The method is validated on synthetic data and used to compare global climate models, yielding spatial and seasonal performance insights.

Core claim

The heat kernel cost converges to the optimal transport cost as time vanishes in the balanced and unbalanced cases. In the specific case of the 2-sphere S^2, the associated Sinkhorn divergences retain the desirable geometric and analytic properties of classical optimal transport discrepancies. A fast Sinkhorn algorithm is derived requiring only O(n) memory and O(n^{3/2}) time per iteration with fully dense GPU-friendly operations.

What carries the argument

The heat kernel cost on the manifold, approximated via spherical harmonic truncation to enable fast convolution-based Sinkhorn iterations that preserve metric and positivity properties on S^2.

Load-bearing premise

The heat kernel approximation on the manifold converges to the true optimal transport cost as diffusion time approaches zero, and spherical harmonic truncation preserves the metric and positivity properties needed for Sinkhorn divergences on the sphere.

What would settle it

A computation on the sphere showing that the difference between the heat kernel Sinkhorn divergence and exact optimal transport cost fails to approach zero as the diffusion time parameter is driven toward zero for simple test measures.

Figures

Figures reproduced from arXiv: 2605.18389 by Iskander Legheraba, L\'eo Buecher, Nicolas Courty, Pierre Hou\'edry.

Figure 1
Figure 1. Figure 1: Mean runtime (seconds) vs. grid resolution averaged over 5 runs; missing values indicate [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Model rankings from SD. Left: Bal￾anced vs. unbalanced Sinkhorn divergences (DJF). Right: DJF vs. JJA unbalanced rank￾ings. Interestingly, rankings exhibit severe changes be￾tween the balanced and unbalanced cases [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: DJF−JJA difference of regional OT gradient norms under the unbalanced Sinkhorn divergence. Red: larger bias in DJF; blue: larger bias in JJA. 4.2.3 Arctic Bias and Sea-Ice Coupling The seasonal collapse of the Arctic gradient norm, from a strongly discriminating signal in DJF to near-zero in JJA across all models, points to a specific physical mechanism: Arctic precipitation in boreal winter is tightly cou… view at source ↗
Figure 4
Figure 4. Figure 4: OT gradient of the Sinkhorn divergence (centered) over the Arctic for ACCESS-CM2 (top) [PITH_FULL_IMAGE:figures/full_fig_p009_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Clenshaw-Curtis, Gauss-Legendre and Lobbato discretization schemes for varying [PITH_FULL_IMAGE:figures/full_fig_p022_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: HEALPix discretization scheme for varying [PITH_FULL_IMAGE:figures/full_fig_p023_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Runtime comparison across methods, grid types, and regularization parameters [PITH_FULL_IMAGE:figures/full_fig_p025_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Mean absolute error between heat and matrix convolutions as a function of [PITH_FULL_IMAGE:figures/full_fig_p026_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Mollweide projections of the heat (left column) and matrix (right column) convolutions of [PITH_FULL_IMAGE:figures/full_fig_p027_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Mollweide projections of the heat (left column) and matrix (right column) convolutions [PITH_FULL_IMAGE:figures/full_fig_p028_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Illustrations of the precipitation distributions for the ERA5 reanalysis (top) and the [PITH_FULL_IMAGE:figures/full_fig_p029_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Regional OT gradient norms for eight CMIP6 models in DJF (up) and JJA (down), [PITH_FULL_IMAGE:figures/full_fig_p030_12.png] view at source ↗
read the original abstract

Optimal transport provides a powerful framework for comparing measures while respecting the geometry of their support, but comes with an expensive computational cost, hindering its potential application to real world use cases. On manifolds, convolutional algorithms based on the heat kernel have been proposed to alleviate this cost, but their theoretical properties remain largely unexplored. We establish that the heat kernel cost converges to the optimal transport cost as time vanishes in the balanced and unbalanced cases. In the specific case of the 2-sphere $\mathbb{S}^2$, we ensure that the associated Sinkhorn divergences retains the desirable geometric and analytic properties of classical optimal transport discrepancies. Moreover, we leverage the harmonic structure of the sphere to derive a fast Sinkhorn algorithm, requiring only $\mathcal{O}(n)$ memory and $\mathcal{O}(n^{3/2})$ time per iteration, with fully dense GPU-friendly operations. We validate its computational efficiency on synthetic data, and discuss its potential use in the evaluation of global climate models, providing both spatial and seasonal insights into models performances.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that a heat-kernel cost on manifolds converges to the optimal transport cost as diffusion time t vanishes (balanced and unbalanced cases). On the 2-sphere it asserts that the associated Sinkhorn divergences retain the geometric and analytic properties of classical OT, and derives a fast Sinkhorn algorithm via spherical-harmonic truncation of the heat kernel that uses O(n) memory and O(n^{3/2}) time per iteration with dense GPU-friendly operations. The method is validated on synthetic data and applied to spatial/seasonal comparisons of global climate models.

Significance. If the convergence result and the preservation of OT properties under the harmonic truncation both hold, the work would supply a computationally attractive, geometry-aware discrepancy for spherical data, with immediate relevance to climate-model intercomparison. The stated complexity improvement over dense OT solvers would be a practical contribution if the implementation and positivity guarantees are verified.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (convergence statement): the manuscript asserts that the heat-kernel cost converges to the OT cost as t→0 in both balanced and unbalanced settings, yet supplies neither a proof sketch, error bounds, nor reference to supporting lemmas; because this limit is the central theoretical justification for using the kernel as a proxy, the absence of a derivation is load-bearing.
  2. [§5] §5 (spherical-harmonic truncation): the partial sum p_t^L = ∑_{l=0}^L e^{-l(l+1)t} (2l+1)/(4π) P_l(cos θ) is a finite Fourier series and therefore subject to Gibbs oscillations; for moderate L and t>0 the truncated kernel can take negative values off the diagonal. Negative entries render the cost C_t = −t log p_t^L undefined over the reals and destroy the positivity and symmetry required for the Sinkhorn divergence to inherit the geometric properties asserted in the abstract.
minor comments (2)
  1. [§6] The synthetic-data validation is described only qualitatively; quantitative tables comparing runtime and accuracy against standard Sinkhorn or entropic OT baselines on the same point sets would strengthen the efficiency claims.
  2. [§4] Notation for the unbalanced case (e.g., the precise form of the marginal penalties) should be stated explicitly when the convergence result is extended beyond the balanced setting.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful and constructive review. The comments highlight important aspects of the theoretical justification and numerical stability that we will strengthen in the revision. We address each major comment below.

read point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (convergence statement): the manuscript asserts that the heat-kernel cost converges to the OT cost as t→0 in both balanced and unbalanced settings, yet supplies neither a proof sketch, error bounds, nor reference to supporting lemmas; because this limit is the central theoretical justification for using the kernel as a proxy, the absence of a derivation is load-bearing.

    Authors: We agree that an explicit justification for the convergence of the heat-kernel cost to the OT cost as t→0 is essential. The manuscript claims this convergence in both balanced and unbalanced cases, but we acknowledge that a self-contained sketch or direct reference would make the argument more transparent. In the revised manuscript we will add a short derivation in §4 based on the well-known short-time asymptotic of the heat kernel on Riemannian manifolds (p_t(x,y) ∼ (4πt)^{-d/2} exp(−d_g(x,y)^2/(4t))), together with a citation to the relevant entropy-regularized OT literature. We will also include a qualitative discussion of the rate at which the approximation error vanishes. revision: yes

  2. Referee: [§5] §5 (spherical-harmonic truncation): the partial sum p_t^L = ∑_{l=0}^L e^{-l(l+1)t} (2l+1)/(4π) P_l(cos θ) is a finite Fourier series and therefore subject to Gibbs oscillations; for moderate L and t>0 the truncated kernel can take negative values off the diagonal. Negative entries render the cost C_t = −t log p_t^L undefined over the reals and destroy the positivity and symmetry required for the Sinkhorn divergence to inherit the geometric properties asserted in the abstract.

    Authors: We appreciate this observation on the possible negativity of the truncated kernel. The Gibbs phenomenon is indeed present for finite L. In the revised §5 we will (i) state explicit parameter regimes (L ≫ 1/√t) under which the truncated sum remains non-negative on the sphere, (ii) report numerical checks confirming positivity for the (L,t) pairs used in the climate-model experiments, and (iii) note that, when needed, a simple positive-part rectification can be applied without altering the O(n) memory or O(n^{3/2}) complexity. These additions will preserve the claimed geometric properties while keeping the algorithm practical. revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation chain is self-contained

full rationale

The paper claims to establish convergence of the heat kernel cost to OT as t vanishes (balanced/unbalanced cases) and to derive an O(n^{3/2}) Sinkhorn algorithm on S^2 via spherical-harmonic truncation of the heat kernel. These steps are presented as independent analytic results relying on standard properties of the heat kernel and spherical harmonics rather than any reduction to fitted parameters, self-definitional loops, or load-bearing self-citations. No quoted equations or sections reduce a claimed prediction or uniqueness result to the paper's own inputs by construction. The central guarantees are therefore not forced by the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities stated. Ledger remains empty pending full text.

pith-pipeline@v0.9.0 · 5714 in / 1119 out tokens · 31670 ms · 2026-05-20T12:53:55.808308+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages · 1 internal anchor

  1. [1]

    Wasserstein Generative Adversarial Networks

    Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein Generative Adversarial Networks. InProceedings of the 34th International Conference on Machine Learning, pages 214–223. PMLR, July 2017

  2. [2]

    Spherical Sliced-Wasserstein

    Clément Bonet, Paul Berg, Nicolas Courty, François Septier, Lucas Drumetz, and Minh Tan Pham. Spherical Sliced-Wasserstein. InThe Eleventh International Conference on Learning Representations, September 2022

  3. [3]

    Slicing Unbalanced Optimal Transport.Transactions on Machine Learning Research, 2024

    Clément Bonet, Kimia Nadjahi, Thibault Séjourné, Kilian Fatras, and Nicolas Courty. Slicing Unbalanced Optimal Transport.Transactions on Machine Learning Research, 2024

  4. [4]

    Spherical fourier neural operators: Learning stable dynamics on the sphere, 2023

    Boris Bonev, Thorsten Kurth, Christian Hundt, Jaideep Pathak, Maximilian Baust, Karthik Kashinath, and Anima Anandkumar. Spherical fourier neural operators: Learning stable dynamics on the sphere, 2023

  5. [5]

    Cambridge University Press, 2023

    Nicolas Boumal.An introduction to optimization on smooth manifolds. Cambridge University Press, 2023

  6. [6]

    JAX: composable transformations of Python+NumPy programs, 2018

    James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Yash Katariya, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman- Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URLhttp://github.com/jax-ml/jax

  7. [7]

    Entropic regularisation of unbalanced optimal transporta- tion problems.arXiv preprint arXiv:2305.02410, 2023

    Maciej Buze and Manh Hong Duong. Entropic regularisation of unbalanced optimal transporta- tion problems.arXiv preprint arXiv:2305.02410, 2023

  8. [8]

    Convergence of entropic schemes for optimal transport and gradient flows.SIAM Journal on Mathematical Analysis, 49(2):1385–1418, 2017

    Guillaume Carlier, Vincent Duval, Gabriel Peyré, and Bernhard Schmitzer. Convergence of entropic schemes for optimal transport and gradient flows.SIAM Journal on Mathematical Analysis, 49(2):1385–1418, 2017

  9. [9]

    Convergence rate of general entropic optimal transport costs.Calculus of Variations and Partial Differential Equations, 62(4):116, 2023

    Guillaume Carlier, Paul Pegon, and Luca Tamanini. Convergence rate of general entropic optimal transport costs.Calculus of Variations and Partial Differential Equations, 62(4):116, 2023

  10. [10]

    Scaling algorithms for unbalanced optimal transport problems.Mathematics of Computation, 87(314): 2563–2609, 2018

    Lénaïc Chizat, Gabriel Peyré, Bernhard Schmitzer, and François-Xavier Vialard. Scaling algorithms for unbalanced optimal transport problems.Mathematics of Computation, 87(314): 2563–2609, 2018. doi: 10.1090/mcom/3303

  11. [11]

    C. W. Clenshaw and A. R. Curtis. A method for numerical integration on an automatic computer. Numerische Mathematik, 2:197–205, 1960

  12. [12]

    Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9): 1853–1865, September 2017

    Nicolas Courty, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. Optimal Transport for Domain Adaptation.IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9): 1853–1865, September 2017. ISSN 1939-3539. doi: 10.1109/TPAMI.2016.2615921

  13. [13]

    Sinkhorn distances: Lightspeed computation of optimal transport

    Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. InAdvances in Neural Information Processing Systems 26, pages 2292–2300. Curran Associates, Inc., 2013

  14. [14]

    Invariant kernels on riemannian symmetric spaces: a harmonic-analytic approach.SIAM Journal on Mathematics of Data Science, 7(2):752–776, 2025

    Nathaël Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega, and Salem Said. Invariant kernels on riemannian symmetric spaces: a harmonic-analytic approach.SIAM Journal on Mathematics of Data Science, 7(2):752–776, 2025

  15. [15]

    A fast and accurate algorithm for spherical harmonic analysis on healpix grids with applications to the cosmic microwave background radiation

    Kathryn P Drake and Grady B Wright. A fast and accurate algorithm for spherical harmonic analysis on healpix grids with applications to the cosmic microwave background radiation. Journal of Computational Physics, 416:109544, 2020. 10

  16. [16]

    J. R. Driscoll and D. M. Healy. Computing fourier transforms and convolutions on the 2-sphere. Advances in applied mathematics, 15(2):202–250, 1994

  17. [17]

    Lecture notes on symmetric spaces.preprint, 1997

    Jost-Hinrich Eschenburg and To Renato. Lecture notes on symmetric spaces.preprint, 1997

  18. [18]

    Eyring, S

    V . Eyring, S. Bony, G. A. Meehl, C. A. Senior, B. Stevens, R. J. Stouffer, and K. E. Taylor. Overview of the coupled model intercomparison project phase 6 (cmip6) experimental design and organization.Geoscientific Model Development, 9(5):1937–1958, 2016

  19. [19]

    J. Faraut. Analysis on lie groups.Cambridge studies in advanced mathematics, 110, 2008

  20. [20]

    Geodesic exponential kernels: When curvature and linearity conflict

    Aasa Feragen, Francois Lauze, and Soren Hauberg. Geodesic exponential kernels: When curvature and linearity conflict. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 3032–3042, 2015

  21. [21]

    Interpolating between optimal transport and mmd using sinkhorn divergences

    Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouve, and Gabriel Peyré. Interpolating between optimal transport and mmd using sinkhorn divergences. In Kamalika Chaudhuri and Masashi Sugiyama, editors,Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 ofProceeding...

  22. [22]

    Alaya, Aurélie Boisbunon, Sylvain Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T

    Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Sylvain Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T. H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, ...

  23. [23]

    Validating climate models with spherical convolutional wasserstein distance.Advances in Neural Information Processing Systems, 37: 59119–59149, 2024

    Robert Garrett, Trevor Harris, Zhuo Wang, and Bo Li. Validating climate models with spherical convolutional wasserstein distance.Advances in Neural Information Processing Systems, 37: 59119–59149, 2024

  24. [24]

    Górski, Eric Hivon, A

    Krzysztof M. Górski, Eric Hivon, A. J. Banday, B. D. Wandelt, Frode K. Hansen, Mstvos Reinecke, and Matthias Bartelmann. HEALPix: A framework for high-resolution discretization and fast analysis of data distributed on the sphere.The Astrophysical Journal, 622(2):759–771, 2005

  25. [25]

    American Mathemat- ical Soc., 2009

    Alexander Grigoryan.Heat kernel and analysis on manifolds, volume 47. American Mathemat- ical Soc., 2009

  26. [26]

    Ffts for the 2-sphere-improvements and variations.Journal of Fourier analysis and applications, 9(4): 341–385, 2003

    Dennis M Healy Jr, Daniel N Rockmore, Peter J Kostelec, and Sean Moore. Ffts for the 2-sphere-improvements and variations.Journal of Fourier analysis and applications, 9(4): 341–385, 2003

  27. [27]

    Academic press, 1979

    Sigurdur Helgason.Differential geometry, Lie groups, and symmetric spaces, volume 80. Academic press, 1979

  28. [28]

    Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz- Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, Adrian Simmons, Cornel Soci, Saleh Abdalla, Xavier Abellan, Gianpaolo Balsamo, Peter Bechtold, Gionata Biavati, Jean Bidlot, Massimo Bonavita, Giovanna De Chiara, Per Dahlgren, Dick Dee, Michail Di...

  29. [29]

    Kernel methods on riemannian manifolds with gaussian rbf kernels.IEEE transactions on pattern analysis and machine intelligence, 37(12):2464–2477, 2015

    Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, and Mehrtash Harandi. Kernel methods on riemannian manifolds with gaussian rbf kernels.IEEE transactions on pattern analysis and machine intelligence, 37(12):2464–2477, 2015

  30. [30]

    On the generalization of equivariance and convolution in neural networks to the action of compact groups

    Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. InInternational conference on machine learning, pages 2747–2755. PMLR, 2018. 11

  31. [31]

    Fast spherical fourier algorithms.Journal of Computational and Applied Mathematics, 161(1):75–98, 2003

    Stefan Kunis and Daniel Potts. Fast spherical fourier algorithms.Journal of Computational and Applied Mathematics, 161(1):75–98, 2003

  32. [32]

    Optimal entropy-transport problems and a new hellinger–kantorovich distance between positive measures.Inventiones mathematicae, 211(3):969–1117, 2018

    Matthias Liero, Alexander Mielke, and Giuseppe Savaré. Optimal entropy-transport problems and a new hellinger–kantorovich distance between positive measures.Inventiones mathematicae, 211(3):969–1117, 2018

  33. [33]

    Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data

    Xinran Liu, Yikun Bai, Rocio Diaz Martin, Kaiwen Shi, Ashkan Shahbazi, Bennett Allan Landman, Catie Chang, and Soheil Kolouri. Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data. InThe Thirteenth International Conference on Learning Representations, October 2024

  34. [34]

    Mardia and Peter E

    Kanti V . Mardia and Peter E. Jupp.Directional Statistics. Wiley Series in Probability and Statistics. Wiley, 2 edition, 2000

  35. [35]

    A novel sampling theorem on the sphere.IEEE Transactions on Signal Processing, 59(12):5876–5887, 2011

    Jason D McEwen and Yves Wiaux. A novel sampling theorem on the sphere.IEEE Transactions on Signal Processing, 59(12):5876–5887, 2011

  36. [36]

    Universal kernels.Journal of Machine Learning Research, 7(12), 2006

    Charles A Micchelli, Yuesheng Xu, and Haizhang Zhang. Universal kernels.Journal of Machine Learning Research, 7(12), 2006

  37. [37]

    Convergence rates for regularized unbalanced optimal transport: the discrete case.arXiv preprint arXiv:2507.07917, 2025

    Luca Nenna, Paul Pegon, and Louis Tocquec. Convergence rates for regularized unbalanced optimal transport: the discrete case.arXiv preprint arXiv:2507.07917, 2025

  38. [38]

    Introduction to entropic optimal transport.Lecture notes, Columbia University, 2021

    Marcel Nutz. Introduction to entropic optimal transport.Lecture notes, Columbia University, 2021

  39. [39]

    PyTorch: An imperative style, high-performance deep learning library

    Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-perfo...

  40. [40]

    Computational optimal transport: With applications to data science.Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019

    Gabriel Peyré, Marco Cuturi, et al. Computational optimal transport: With applications to data science.Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019

  41. [41]

    Random coding strategies for minimum entropy.IEEE Transactions on Information Theory, 21(4):388–391, 2003

    Edward Posner. Random coding strategies for minimum entropy.IEEE Transactions on Information Theory, 21(4):388–391, 2003

  42. [42]

    Differentiable and accelerated spherical harmonic and wigner transforms.Journal of Computational Physics, 510:113109, 2024

    Matthew A Price and Jason D McEwen. Differentiable and accelerated spherical harmonic and wigner transforms.Journal of Computational Physics, 510:113109, 2024

  43. [43]

    Sliced optimal transport on the sphere

    Michael Quellmalz, Robert Beinert, and Gabriele Steidl. Sliced optimal transport on the sphere. Inverse Problems, 39(10):105005, August 2023. ISSN 0266-5611. doi: 10.1088/1361-6420/ acf156

  44. [44]

    Parallelly sliced optimal transport on spheres and on the rotation group.Journal of Mathematical Imaging and Vision, 66(6):951–976, 2024

    Michael Quellmalz, Léo Buecher, and Gabriele Steidl. Parallelly sliced optimal transport on spheres and on the rotation group.Journal of Mathematical Imaging and Vision, 66(6):951–976, 2024

  45. [45]

    Birkhäuser, 2015

    Filippo Santambrogio.Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, volume 87. Birkhäuser, 2015

  46. [46]

    Sinkhorn divergences for unbalanced optimal transport.arXiv preprint arXiv:1910.12958, 2019

    Thibault Séjourné, Jean Feydy, François-Xavier Vialard, Alain Trouvé, and Gabriel Peyré. Sinkhorn divergences for unbalanced optimal transport.arXiv preprint arXiv:1910.12958, 2019

  47. [47]

    Serreze and Walter N

    Mark C. Serreze and Walter N. Meier. The Arctic’s sea ice cover: trends, variability, predictabil- ity, and comparisons to the Antarctic.Annals of the New York Academy of Sciences, 1436(1): 36–53, January 2019. doi: 10.1111/nyas.13856. Epub 2018 May 28

  48. [48]

    Convolutional wasserstein distances: Efficient optimal transportation on geometric domains.ACM Transactions on Graphics (ToG), 34(4):1–11, 2015

    Justin Solomon, Fernando De Goes, Gabriel Peyré, Marco Cuturi, Adrian Butscher, Andy Nguyen, Tao Du, and Leonidas Guibas. Convolutional wasserstein distances: Efficient optimal transportation on geometric domains.ACM Transactions on Graphics (ToG), 34(4):1–11, 2015. 12

  49. [49]

    Disco: accurate discrete scale convo- lutions.arXiv preprint arXiv:2106.02733, 2021

    Ivan Sosnovik, Artem Moskalev, and Arnold Smeulders. Disco: accurate discrete scale convo- lutions.arXiv preprint arXiv:2106.02733, 2021

  50. [50]

    Universal kernels via harmonic analysis on riemannian symmetric spaces

    Franziskus Steinert, Salem Said, and Cyrus Mostajeran. Universal kernels via harmonic analysis on riemannian symmetric spaces. InInternational Conference on Geometric Science of Information, pages 172–180. Springer, 2025

  51. [51]

    Unbalanced optimal transport, from theory to numerics

    Thibault Séjourné, Gabriel Peyré, and François-Xavier Vialard. Unbalanced optimal transport, from theory to numerics. In Emmanuel Trélat and Enrique Zuazua, editors,Numerical Con- trol: Part B, volume 24 ofHandbook of Numerical Analysis, pages 407–471. Elsevier, 2023. doi: https://doi.org/10.1016/bs.hna.2022.11.003. URL https://www.sciencedirect.com/ scie...

  52. [52]

    Stereographic Spherical Sliced Wasserstein Distances, February 2024

    Huy Tran, Yikun Bai, Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Rocio Diaz Martin, and Soheil Kolouri. Stereographic Spherical Sliced Wasserstein Distances, February 2024

  53. [53]

    S. R. S. Varadhan. On the behavior of the fundamental solution of the heat equation with variable coefficients.Communications on Pure and Applied Mathematics, 20(2):431–455, 1967

  54. [54]

    Climate error metrics based on wasserstein distances

    Carlos Veiga Rodrigues and Io Odderskov. Climate error metrics based on wasserstein distances. Applied Energy, 398, 2025

  55. [55]

    Springer, 2009

    Cédric Villani.Optimal Transport: Old and New. Springer, 2009

  56. [56]

    Evaluating the performance of climate models based on wasserstein distance.Geophysical Research Letters, 47(21), 2020

    Gabriele Vissio, Valerio Lembo, Valerio Lucarini, and Michael Ghil. Evaluating the performance of climate models based on wasserstein distance.Geophysical Research Letters, 47(21), 2020

  57. [57]

    Discovering climate change during the early 21st century via wasserstein stability analysis.Advances in Atmospheric Sciences, 42:373–381, 02 2025

    Zhiang Xie, Dongwei Chen, and Puxi Li. Discovering climate change during the early 21st century via wasserstein stability analysis.Advances in Atmospheric Sciences, 42:373–381, 02 2025

  58. [58]

    Flashsinkhorn: Io-aware entropic optimal transport.arXiv preprint arXiv:2602.03067, 2026

    Felix X-F Ye, Xingjie Li, An Yu, Ming-Ching Chang, Linsong Chu, and Davis Wertheimer. Flashsinkhorn: Io-aware entropic optimal transport.arXiv preprint arXiv:2602.03067, 2026. A Proofs A.1 Proof of Theorem 3.1 We will need some preliminaries. First, note that by Varadhan’s celebrated short-time asymptotics of the heat kernel, one has uniform convergence o...

  59. [59]

    As M is compact and d2 is continuous, an optimal coupling exists for the balanced OT problem [45, Thm. 1.4]. Henceπ ⋆ is in fact a coupling

  60. [60]

    For the upper bound, by the block approximation Lemma A.2, we get that the πη are couplings, which ensures that the proof goes through

  61. [61]

    (πε)ε>0 is in fact a family of couplings

    For the lower bound, one observes that regularized entropic OT admits a minimizer and that it is a coupling (see, e.g., [38, Theorem 4.2]) i.e. (πε)ε>0 is in fact a family of couplings. As a result, so are the sequence (πk)k≥k0 and its subsequence (πkn)n as well as the limit π of that subsequence. This ensures that the proof goes through. A.2 Proof of Pro...