Spherical Harmonic Optimal Transport: Application to Climate Models Comparisons
Pith reviewed 2026-05-20 12:53 UTC · model grok-4.3
The pith
Heat kernel costs converge to optimal transport costs as diffusion time vanishes on manifolds, yielding a fast spherical harmonic Sinkhorn algorithm on the sphere.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The heat kernel cost converges to the optimal transport cost as time vanishes in the balanced and unbalanced cases. In the specific case of the 2-sphere S^2, the associated Sinkhorn divergences retain the desirable geometric and analytic properties of classical optimal transport discrepancies. A fast Sinkhorn algorithm is derived requiring only O(n) memory and O(n^{3/2}) time per iteration with fully dense GPU-friendly operations.
What carries the argument
The heat kernel cost on the manifold, approximated via spherical harmonic truncation to enable fast convolution-based Sinkhorn iterations that preserve metric and positivity properties on S^2.
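A minimal sketch of why the heat kernel fits Sinkhorn so naturally, under stated assumptions: if the cost is C_t = −t log p_t and the entropic regularisation is chosen as ε = t (an assumption made here for illustration), the Gibbs kernel exp(−C_t/ε) is the heat kernel itself, so each Sinkhorn iteration only needs products with p_t. The code below applies a truncated kernel as a dense matrix on random points, which costs O(n^2) per iteration and is not the paper's fast harmonic-space algorithm. The function names, the bandlimit L = 64, the time t = 0.3, and the uniform weights are all hypothetical choices.

```python
import numpy as np
from numpy.polynomial import legendre


def heat_kernel_matrix(x, y, t, L=64):
    """Dense K[i, j] = p_t^L(angle(x_i, y_j)) for unit vectors x, y on S^2."""
    l = np.arange(L + 1)
    # Legendre coefficients of the heat kernel truncated at degree L.
    coeffs = (2 * l + 1) / (4 * np.pi) * np.exp(-l * (l + 1) * t)
    cosines = np.clip(x @ y.T, -1.0, 1.0)
    return legendre.legval(cosines, coeffs)


def sinkhorn_scalings(K, a, b, n_iter=1000):
    """Classical Sinkhorn fixed point; the plan is diag(u) K diag(v)."""
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u, v


def random_sphere_points(rng, n):
    """Illustrative test data: normalised Gaussian samples."""
    pts = rng.normal(size=(n, 3))
    return pts / np.linalg.norm(pts, axis=1, keepdims=True)


rng = np.random.default_rng(0)
x, y = random_sphere_points(rng, 50), random_sphere_points(rng, 60)
a, b = np.full(50, 1 / 50), np.full(60, 1 / 60)

K = heat_kernel_matrix(x, y, t=0.3)  # assumed choice: epsilon = t
u, v = sinkhorn_scalings(K, a, b)
plan = u[:, None] * K * v[None, :]
print("row-marginal error:", np.abs(plan.sum(axis=1) - a).max())
```

In the setting the summary describes, the dense products `K @ v` and `K.T @ u` would be replaced by a spherical harmonic transform, a per-degree multiplication by e^{-l(l+1)t}, and an inverse transform; that replacement is not shown here.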
Load-bearing premise
The heat kernel approximation on the manifold converges to the true optimal transport cost as diffusion time approaches zero, and spherical harmonic truncation preserves the metric and positivity properties needed for Sinkhorn divergences on the sphere.
What would settle it
A computation on the sphere showing that the difference between the heat kernel Sinkhorn divergence and exact optimal transport cost fails to approach zero as the diffusion time parameter is driven toward zero for simple test measures.
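A weaker, pointwise version of such a check can be sketched numerically. The code below does not compute a Sinkhorn divergence or an exact OT cost; it only compares the heat-kernel cost between two points at geodesic distance θ with θ^2, which is the ingredient the convergence claim rests on. Two assumptions are made for illustration: the kernel is evaluated by a Legendre sum with a large bandlimit (L = 200), and the pair-independent factor (4πt)^{-1} from the short-time asymptotic on a 2-dimensional manifold is removed before taking the logarithm. The function names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def heat_kernel_s2(theta, t, L=200):
    """Heat kernel on the unit 2-sphere at angle theta (Legendre sum up to L)."""
    l = np.arange(L + 1)
    coeffs = (2 * l + 1) / (4 * np.pi) * np.exp(-l * (l + 1) * t)
    return legendre.legval(np.cos(theta), coeffs)


def varadhan_gap(theta, t, L=200):
    """|-4 t log(4 pi t p_t(theta)) - theta^2|, expected to shrink as t -> 0."""
    p = heat_kernel_s2(theta, t, L)
    normalised_cost = -4.0 * t * np.log(4.0 * np.pi * t * p)
    return abs(normalised_cost - theta**2)


theta = 1.0  # geodesic distance between the two test points, in radians
for t in (0.2, 0.1, 0.05, 0.02):
    print(f"t = {t:5.2f}   gap = {varadhan_gap(theta, t):.4f}")
```

Pushing t much lower in double precision is unreliable with this direct sum, because the kernel value at fixed θ decays like exp(−θ^2/(4t)) and eventually falls below round-off; a failure observed in that regime would therefore not by itself count against the claim.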
Original abstract
Optimal transport provides a powerful framework for comparing measures while respecting the geometry of their support, but comes at a high computational cost, hindering its potential application to real-world use cases. On manifolds, convolutional algorithms based on the heat kernel have been proposed to alleviate this cost, but their theoretical properties remain largely unexplored. We establish that the heat kernel cost converges to the optimal transport cost as time vanishes in the balanced and unbalanced cases. In the specific case of the 2-sphere $\mathbb{S}^2$, we ensure that the associated Sinkhorn divergences retain the desirable geometric and analytic properties of classical optimal transport discrepancies. Moreover, we leverage the harmonic structure of the sphere to derive a fast Sinkhorn algorithm, requiring only $\mathcal{O}(n)$ memory and $\mathcal{O}(n^{3/2})$ time per iteration, with fully dense GPU-friendly operations. We validate its computational efficiency on synthetic data, and discuss its potential use in the evaluation of global climate models, providing both spatial and seasonal insights into model performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that a heat-kernel cost on manifolds converges to the optimal transport cost as diffusion time t vanishes (balanced and unbalanced cases). On the 2-sphere it asserts that the associated Sinkhorn divergences retain the geometric and analytic properties of classical OT, and derives a fast Sinkhorn algorithm via spherical-harmonic truncation of the heat kernel that uses O(n) memory and O(n^{3/2}) time per iteration with dense GPU-friendly operations. The method is validated on synthetic data and applied to spatial/seasonal comparisons of global climate models.
Significance. If the convergence result and the preservation of OT properties under the harmonic truncation both hold, the work would supply a computationally attractive, geometry-aware discrepancy for spherical data, with immediate relevance to climate-model intercomparison. The stated complexity improvement over dense OT solvers would be a practical contribution if the implementation and positivity guarantees are verified.
major comments (2)
- [Abstract and §4] Convergence statement: the manuscript asserts that the heat-kernel cost converges to the OT cost as t→0 in both balanced and unbalanced settings, yet supplies no proof sketch, error bounds, or reference to supporting lemmas. Because this limit is the central theoretical justification for using the kernel as a proxy, the missing derivation leaves the main claim unsupported.
- [§5] Spherical-harmonic truncation: the partial sum p_t^L = ∑_{l=0}^L e^{-l(l+1)t} (2l+1)/(4π) P_l(cos θ) is a finite Fourier–Legendre series and therefore subject to Gibbs oscillations; for moderate L and t>0 the truncated kernel can take negative values off the diagonal. Negative entries render the cost C_t = −t log p_t^L undefined over the reals and destroy the positivity and symmetry required for the Sinkhorn divergence to inherit the geometric properties asserted in the abstract.
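The truncation concern can be probed directly. The following sketch evaluates the partial sum p_t^L written in the comment above on a grid of angles and reports its minimum for a few bandlimits. The function name, the grid, and the (L, t) pairs are illustrative assumptions and are not the values used in the paper.

```python
import numpy as np
from numpy.polynomial import legendre


def truncated_heat_kernel(theta, t, L):
    """Partial sum p_t^L(theta) of the heat kernel on the unit 2-sphere."""
    l = np.arange(L + 1)
    coeffs = (2 * l + 1) / (4 * np.pi) * np.exp(-l * (l + 1) * t)
    # legval evaluates sum_l coeffs[l] * P_l(x) at x = cos(theta).
    return legendre.legval(np.cos(theta), coeffs)


theta = np.linspace(0.0, np.pi, 2001)
t = 0.05
for L in (5, 20, 60):
    p = truncated_heat_kernel(theta, t, L)
    print(f"L = {L:3d}   min over grid = {p.min(): .3e}")
```

For the low bandlimit the minimum is clearly negative, so −t log p_t^L is not real-valued there. For the larger bandlimits the truncation error becomes negligible, but the exact kernel near the antipode is then far below double-precision round-off at this t, so the computed sign is not reliable either; this is a limitation of the direct sum used in the sketch.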
minor comments (2)
- [§6] The synthetic-data validation is described only qualitatively; quantitative tables comparing runtime and accuracy against standard Sinkhorn or entropic OT baselines on the same point sets would strengthen the efficiency claims.
- [§4] Notation for the unbalanced case (e.g., the precise form of the marginal penalties) should be stated explicitly when the convergence result is extended beyond the balanced setting.
Simulated Author's Rebuttal
We thank the referee for their careful and constructive review. The comments highlight important aspects of the theoretical justification and numerical stability that we will strengthen in the revision. We address each major comment below.
Point-by-point responses
- Referee: [Abstract and §4] Convergence statement: the manuscript asserts that the heat-kernel cost converges to the OT cost as t→0 in both balanced and unbalanced settings, yet supplies no proof sketch, error bounds, or reference to supporting lemmas. Because this limit is the central theoretical justification for using the kernel as a proxy, the missing derivation leaves the main claim unsupported.
  Authors: We agree that an explicit justification for the convergence of the heat-kernel cost to the OT cost as t→0 is essential. The manuscript claims this convergence in both balanced and unbalanced cases, but we acknowledge that a self-contained sketch or direct reference would make the argument more transparent. In the revised manuscript we will add a short derivation in §4 based on the well-known short-time asymptotics of the heat kernel on Riemannian manifolds (p_t(x,y) ∼ (4πt)^{-d/2} exp(−d_g(x,y)^2/(4t))), together with a citation to the relevant entropy-regularized OT literature. We will also include a qualitative discussion of the rate at which the approximation error vanishes.
  Revision: yes
- Referee: [§5] Spherical-harmonic truncation: the partial sum p_t^L = ∑_{l=0}^L e^{-l(l+1)t} (2l+1)/(4π) P_l(cos θ) is a finite Fourier–Legendre series and therefore subject to Gibbs oscillations; for moderate L and t>0 the truncated kernel can take negative values off the diagonal. Negative entries render the cost C_t = −t log p_t^L undefined over the reals and destroy the positivity and symmetry required for the Sinkhorn divergence to inherit the geometric properties asserted in the abstract.
  Authors: We appreciate this observation on the possible negativity of the truncated kernel. The Gibbs phenomenon is indeed present for finite L. In the revised §5 we will (i) state explicit parameter regimes (L ≫ 1/√t) under which the truncated sum remains non-negative on the sphere, (ii) report numerical checks confirming positivity for the (L,t) pairs used in the climate-model experiments, and (iii) note that, when needed, a simple positive-part rectification can be applied without altering the O(n) memory or O(n^{3/2}) time complexity. These additions will preserve the claimed geometric properties while keeping the algorithm practical.
  Revision: partial
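The rectification mentioned in point (iii) can be sketched as follows. Since a strict positive part would still leave zeros, and hence infinite costs, this sketch clips the truncated kernel at a small positive floor instead; the floor value 1e-12, the function names, and the (L, t) pair are hypothetical choices, not something the rebuttal specifies.

```python
import numpy as np
from numpy.polynomial import legendre


def truncated_heat_kernel(cos_theta, t, L):
    """Partial sum p_t^L evaluated at x = cos(theta)."""
    l = np.arange(L + 1)
    coeffs = (2 * l + 1) / (4 * np.pi) * np.exp(-l * (l + 1) * t)
    return legendre.legval(cos_theta, coeffs)


def rectified_cost(cos_theta, t, L, floor=1e-12):
    """C_t = -t * log(max(p_t^L, floor)); `floor` is an assumed safeguard."""
    p = truncated_heat_kernel(cos_theta, t, L)
    n_clipped = int(np.count_nonzero(p < floor))
    return -t * np.log(np.maximum(p, floor)), n_clipped


cos_theta = np.cos(np.linspace(0.0, np.pi, 2001))
cost, n_clipped = rectified_cost(cos_theta, t=0.05, L=5)
print("grid points clipped:", n_clipped)
print("largest cost after clipping:", cost.max())
```

Clipping is elementwise, so it keeps the cost finite and symmetric, but it does not in general preserve positive definiteness of the kernel. Whether the geometric properties claimed for the Sinkhorn divergence survive rectification is therefore a separate question that this sketch does not address.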
Circularity Check
No significant circularity; derivation chain is self-contained
full rationale
The paper claims to establish convergence of the heat kernel cost to OT as t vanishes (balanced/unbalanced cases) and to derive an O(n^{3/2}) Sinkhorn algorithm on S^2 via spherical-harmonic truncation of the heat kernel. These steps are presented as independent analytic results relying on standard properties of the heat kernel and spherical harmonics rather than any reduction to fitted parameters, self-definitional loops, or load-bearing self-citations. No quoted equations or sections reduce a claimed prediction or uniqueness result to the paper's own inputs by construction. The central guarantees are therefore not forced by the inputs.
Reference graph
Works this paper leans on
- [1] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein Generative Adversarial Networks. In Proceedings of the 34th International Conference on Machine Learning, pages 214–223. PMLR, July 2017.
- [2] Clément Bonet, Paul Berg, Nicolas Courty, François Septier, Lucas Drumetz, and Minh Tan Pham. Spherical Sliced-Wasserstein. In The Eleventh International Conference on Learning Representations, September 2022.
- [3] Clément Bonet, Kimia Nadjahi, Thibault Séjourné, Kilian Fatras, and Nicolas Courty. Slicing Unbalanced Optimal Transport. Transactions on Machine Learning Research, 2024.
- [4] Boris Bonev, Thorsten Kurth, Christian Hundt, Jaideep Pathak, Maximilian Baust, Karthik Kashinath, and Anima Anandkumar. Spherical Fourier neural operators: Learning stable dynamics on the sphere, 2023.
- [5] Nicolas Boumal. An introduction to optimization on smooth manifolds. Cambridge University Press, 2023.
- [6] James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Yash Katariya, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/jax-ml/jax
- [7] Maciej Buze and Manh Hong Duong. Entropic regularisation of unbalanced optimal transportation problems. arXiv preprint arXiv:2305.02410, 2023.
- [8] Guillaume Carlier, Vincent Duval, Gabriel Peyré, and Bernhard Schmitzer. Convergence of entropic schemes for optimal transport and gradient flows. SIAM Journal on Mathematical Analysis, 49(2):1385–1418, 2017.
- [9] Guillaume Carlier, Paul Pegon, and Luca Tamanini. Convergence rate of general entropic optimal transport costs. Calculus of Variations and Partial Differential Equations, 62(4):116, 2023.
- [10] Lénaïc Chizat, Gabriel Peyré, Bernhard Schmitzer, and François-Xavier Vialard. Scaling algorithms for unbalanced optimal transport problems. Mathematics of Computation, 87(314):2563–2609, 2018. doi: 10.1090/mcom/3303
- [11] C. W. Clenshaw and A. R. Curtis. A method for numerical integration on an automatic computer. Numerische Mathematik, 2:197–205, 1960.
- [12] Nicolas Courty, Rémi Flamary, Devis Tuia, and Alain Rakotomamonjy. Optimal Transport for Domain Adaptation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(9):1853–1865, September 2017. ISSN 1939-3539. doi: 10.1109/TPAMI.2016.2615921
- [13] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems 26, pages 2292–2300. Curran Associates, Inc., 2013.
- [14] Nathaël Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega, and Salem Said. Invariant kernels on Riemannian symmetric spaces: a harmonic-analytic approach. SIAM Journal on Mathematics of Data Science, 7(2):752–776, 2025.
- [15] Kathryn P Drake and Grady B Wright. A fast and accurate algorithm for spherical harmonic analysis on HEALPix grids with applications to the cosmic microwave background radiation. Journal of Computational Physics, 416:109544, 2020.
- [16] J. R. Driscoll and D. M. Healy. Computing Fourier transforms and convolutions on the 2-sphere. Advances in Applied Mathematics, 15(2):202–250, 1994.
- [17] Jost-Hinrich Eschenburg and To Renato. Lecture notes on symmetric spaces. preprint, 1997.
- [18]
- [19] J. Faraut. Analysis on Lie groups. Cambridge Studies in Advanced Mathematics, 110, 2008.
- [20] Aasa Feragen, Francois Lauze, and Soren Hauberg. Geodesic exponential kernels: When curvature and linearity conflict. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3032–3042, 2015.
- [21] Jean Feydy, Thibault Séjourné, François-Xavier Vialard, Shun-ichi Amari, Alain Trouvé, and Gabriel Peyré. Interpolating between optimal transport and MMD using Sinkhorn divergences. In Kamalika Chaudhuri and Masashi Sugiyama, editors, Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, volume 89 of Proceeding...
- [22] Rémi Flamary, Nicolas Courty, Alexandre Gramfort, Mokhtar Z. Alaya, Aurélie Boisbunon, Sylvain Chambon, Laetitia Chapel, Adrien Corenflos, Kilian Fatras, Nemo Fournier, Léo Gautheron, Nathalie T. H. Gayraud, Hicham Janati, Alain Rakotomamonjy, Ievgen Redko, Antoine Rolet, Antony Schutz, Vivien Seguy, Danica J. Sutherland, Romain Tavenard, Alexander Tong, ...
- [23] Robert Garrett, Trevor Harris, Zhuo Wang, and Bo Li. Validating climate models with spherical convolutional Wasserstein distance. Advances in Neural Information Processing Systems, 37:59119–59149, 2024.
- [24] Krzysztof M. Górski, Eric Hivon, A. J. Banday, B. D. Wandelt, Frode K. Hansen, Martin Reinecke, and Matthias Bartelmann. HEALPix: A framework for high-resolution discretization and fast analysis of data distributed on the sphere. The Astrophysical Journal, 622(2):759–771, 2005.
- [25] Alexander Grigoryan. Heat kernel and analysis on manifolds, volume 47. American Mathematical Soc., 2009.
- [26] Dennis M Healy Jr, Daniel N Rockmore, Peter J Kostelec, and Sean Moore. FFTs for the 2-sphere-improvements and variations. Journal of Fourier Analysis and Applications, 9(4):341–385, 2003.
- [27] Sigurdur Helgason. Differential geometry, Lie groups, and symmetric spaces, volume 80. Academic Press, 1979.
- [28] Hans Hersbach, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz-Sabater, Julien Nicolas, Carole Peubey, Raluca Radu, Dinand Schepers, Adrian Simmons, Cornel Soci, Saleh Abdalla, Xavier Abellan, Gianpaolo Balsamo, Peter Bechtold, Gionata Biavati, Jean Bidlot, Massimo Bonavita, Giovanna De Chiara, Per Dahlgren, Dick Dee, Michail Di...
- [29] Sadeep Jayasumana, Richard Hartley, Mathieu Salzmann, Hongdong Li, and Mehrtash Harandi. Kernel methods on Riemannian manifolds with Gaussian RBF kernels. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37(12):2464–2477, 2015.
- [30] Risi Kondor and Shubhendu Trivedi. On the generalization of equivariance and convolution in neural networks to the action of compact groups. In International Conference on Machine Learning, pages 2747–2755. PMLR, 2018.
- [31] Stefan Kunis and Daniel Potts. Fast spherical Fourier algorithms. Journal of Computational and Applied Mathematics, 161(1):75–98, 2003.
- [32] Matthias Liero, Alexander Mielke, and Giuseppe Savaré. Optimal entropy-transport problems and a new Hellinger–Kantorovich distance between positive measures. Inventiones mathematicae, 211(3):969–1117, 2018.
- [33] Xinran Liu, Yikun Bai, Rocio Diaz Martin, Kaiwen Shi, Ashkan Shahbazi, Bennett Allan Landman, Catie Chang, and Soheil Kolouri. Linear Spherical Sliced Optimal Transport: A Fast Metric for Comparing Spherical Data. In The Thirteenth International Conference on Learning Representations, October 2024.
- [34] Kanti V. Mardia and Peter E. Jupp. Directional Statistics. Wiley Series in Probability and Statistics. Wiley, 2nd edition, 2000.
- [35] Jason D McEwen and Yves Wiaux. A novel sampling theorem on the sphere. IEEE Transactions on Signal Processing, 59(12):5876–5887, 2011.
- [36] Charles A Micchelli, Yuesheng Xu, and Haizhang Zhang. Universal kernels. Journal of Machine Learning Research, 7(12), 2006.
- [37] Luca Nenna, Paul Pegon, and Louis Tocquec. Convergence rates for regularized unbalanced optimal transport: the discrete case. arXiv preprint arXiv:2507.07917, 2025.
- [38] Marcel Nutz. Introduction to entropic optimal transport. Lecture notes, Columbia University, 2021.
- [39] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Köpf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-perfo...
- [40] Gabriel Peyré, Marco Cuturi, et al. Computational optimal transport: With applications to data science. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019.
- [41] Edward Posner. Random coding strategies for minimum entropy. IEEE Transactions on Information Theory, 21(4):388–391, 2003.
- [42] Matthew A Price and Jason D McEwen. Differentiable and accelerated spherical harmonic and Wigner transforms. Journal of Computational Physics, 510:113109, 2024.
- [43] Michael Quellmalz, Robert Beinert, and Gabriele Steidl. Sliced optimal transport on the sphere. Inverse Problems, 39(10):105005, August 2023. ISSN 0266-5611. doi: 10.1088/1361-6420/acf156
- [44] Michael Quellmalz, Léo Buecher, and Gabriele Steidl. Parallelly sliced optimal transport on spheres and on the rotation group. Journal of Mathematical Imaging and Vision, 66(6):951–976, 2024.
- [45] Filippo Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, volume 87. Birkhäuser, 2015.
- [46] Thibault Séjourné, Jean Feydy, François-Xavier Vialard, Alain Trouvé, and Gabriel Peyré. Sinkhorn divergences for unbalanced optimal transport. arXiv preprint arXiv:1910.12958, 2019.
- [47] Mark C. Serreze and Walter N. Meier. The Arctic’s sea ice cover: trends, variability, predictability, and comparisons to the Antarctic. Annals of the New York Academy of Sciences, 1436(1):36–53, January 2019. doi: 10.1111/nyas.13856. Epub 2018 May 28.
- [48] Justin Solomon, Fernando De Goes, Gabriel Peyré, Marco Cuturi, Adrian Butscher, Andy Nguyen, Tao Du, and Leonidas Guibas. Convolutional Wasserstein distances: Efficient optimal transportation on geometric domains. ACM Transactions on Graphics (ToG), 34(4):1–11, 2015.
- [49] Ivan Sosnovik, Artem Moskalev, and Arnold Smeulders. Disco: accurate discrete scale convolutions. arXiv preprint arXiv:2106.02733, 2021.
- [50] Franziskus Steinert, Salem Said, and Cyrus Mostajeran. Universal kernels via harmonic analysis on Riemannian symmetric spaces. In International Conference on Geometric Science of Information, pages 172–180. Springer, 2025.
- [51] Thibault Séjourné, Gabriel Peyré, and François-Xavier Vialard. Unbalanced optimal transport, from theory to numerics. In Emmanuel Trélat and Enrique Zuazua, editors, Numerical Control: Part B, volume 24 of Handbook of Numerical Analysis, pages 407–471. Elsevier, 2023. doi: https://doi.org/10.1016/bs.hna.2022.11.003. URL https://www.sciencedirect.com/ scie...
- [52] Huy Tran, Yikun Bai, Abihith Kothapalli, Ashkan Shahbazi, Xinran Liu, Rocio Diaz Martin, and Soheil Kolouri. Stereographic Spherical Sliced Wasserstein Distances, February 2024.
- [53] S. R. S. Varadhan. On the behavior of the fundamental solution of the heat equation with variable coefficients. Communications on Pure and Applied Mathematics, 20(2):431–455, 1967.
- [54] Carlos Veiga Rodrigues and Io Odderskov. Climate error metrics based on Wasserstein distances. Applied Energy, 398, 2025.
- [55]
- [56] Gabriele Vissio, Valerio Lembo, Valerio Lucarini, and Michael Ghil. Evaluating the performance of climate models based on Wasserstein distance. Geophysical Research Letters, 47(21), 2020.
- [57] Zhiang Xie, Dongwei Chen, and Puxi Li. Discovering climate change during the early 21st century via Wasserstein stability analysis. Advances in Atmospheric Sciences, 42:373–381, 02 2025.
- [58] Felix X-F Ye, Xingjie Li, An Yu, Ming-Ching Chang, Linsong Chu, and Davis Wertheimer. FlashSinkhorn: IO-aware entropic optimal transport. arXiv preprint arXiv:2602.03067, 2026.
A Proofs
A.1 Proof of Theorem 3.1
We will need some preliminaries. First, note that by Varadhan’s celebrated short-time asymptotics of the heat kernel, one has uniform convergence o...
- [59] As M is compact and d^2 is continuous, an optimal coupling exists for the balanced OT problem [45, Thm. 1.4]. Hence π⋆ is in fact a coupling.
- [60] For the upper bound, by the block approximation Lemma A.2, we get that the π_η are couplings, which ensures that the proof goes through.
- [61] For the lower bound, one observes that regularized entropic OT admits a minimizer and that it is a coupling (see, e.g., [38, Theorem 4.2]), i.e. (π_ε)_{ε>0} is in fact a family of couplings. As a result, so are the sequence (π_k)_{k≥k_0} and its subsequence (π_{k_n})_n as well as the limit π of that subsequence. This ensures that the proof goes through. A.2 Proof of Pro...